Lecture Notes in 
Computer Science 1872 



J. van Leeuwen O. Watanabe 
M. Hagiya P.D. Mosses T. Ito (Eds.) 



Theoretical 
Computer Science 

Exploring New Frontiers 
of Theoretical Informatics 

International Conference IFIP TCS 2000 
Sendai, Japan, August 2000 
Proceedings 




IFIP TC1 




Springer 





Lecture Notes in Computer Science 1872 

Edited by G. Goos, J. Hartmanis and J. van Leeuwen 




Springer 

Berlin 

Heidelberg 

New York 

Barcelona 

Hong Kong 

London 

Milan 

Paris 

Singapore 

Tokyo 




Jan van Leeuwen Osamu Watanabe 
Masami Hagiya Peter D. Mosses 
Takayasu Ito (Eds.) 



Theoretical 
Computer Science 

Exploring New Frontiers 
of Theoretical Informatics 



International Conference IFIP TCS 2000 
Sendai, Japan, August 17-19, 2000 
Proceedings 




Springer 




Series Editors 

Gerhard Goos, Karlsruhe University, Germany 
Juris Hartmanis, Cornell University, NY, USA 
Jan van Leeuwen, Utrecht University, The Netherlands 

Volume Editors 
Jan van Leeuwen 

University of Utrecht, Department of Computer Science 
Centrumgebouw Noord, Office A309 
Padualaan 14, De Uithof, Utrecht, The Netherlands 
E-mail: jan@cs.uu.nl 

Osamu Watanabe 

Tokyo Institute of Technology, Department of Information Science 
Ookayama 2-12-1, Meguro-ku, Tokyo 152-8552, Japan 
E-mail: watanabe@is.titech.ac.jp 
Masami Hagiya 

The University of Tokyo, Graduate School of Science 

Department of Information Science, Bunkyo-ku, Tokyo 113-0033, Japan 

E-mail: hagiya@is.s.u-tokyo.ac.jp 

Peter D. Mosses 

University of Aarhus, Department of Computer Science 
Ny Munkegarde, Bldg. 540, 8000 Aarhus C, Denmark 
E-mail: pdmosses@daimi.au.dk 
Takayasu Ito 

Tohoku University, Graduate School of Information Sciences 

Department of Computer and Mathematical Sciences, Sendai, Japan 980-8579 

E-mail: ito@ito.ecei.tohoku.ac.jp 

Cataloging-in-Publication Data applied for 

Die Deutsche Bibliothek - CIP-Einheitsaufnahme 

Theoretical computer science : exploring new frontiers of theoretical 

informatics ; international conference ; proceedings / IFIP TCS 2000, 

Sendai, Japan, August 17 - 19, 2000. J. van Leeuwen . . . (ed.). - 
Berlin ; Heidelberg ; New York ; Barcelona ; Hong Kong ; London ; 

Milan ; Paris ; Singapore ; Tokyo : Springer, 2000 
(Lecture notes in computer science ; Vol. 1872) 

ISBN 3-540-67823-9 

CR Subject Classification (1998): F, D.3, E.1, I.3, C.2 
ISSN 0302-9743 

ISBN 3-540-67823-9 Springer-Verlag Berlin Heidelberg New York 

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is 
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, 
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication 
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, 
in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are 
liable for prosecution under the German Copyright Law. 

Springer-Verlag Berlin Heidelberg New York 

a member of BertelsmannSpringer Science+Business Media GmbH 
© Springer-Verlag Berlin Heidelberg 2000 
Printed in Germany 

Typesetting: Camera-ready by author, data conversion by PTP-Berlin, Stefan Sossna 
Printed on acid-free paper SPIN: 10722337 06/3142 5 4 3 2 1 0 




Foreword 



In 1996 the International Federation for Information Processing (IFIP) established 
its first Technical Committee on foundations of computer science, TC1. The 
aim of IFIP TC1 is to support the development of theoretical computer science 
as a fundamental science and to promote the exploration of fundamental con- 
cepts, models, theories, and formal systems in order to understand laws, limits, 
and possibilities of information processing. 

This volume constitutes the proceedings of the first IFIP International Con- 
ference on Theoretical Computer Science (IFIP TCS 2000) - Exploring New 
Frontiers of Theoretical Informatics - organized by IFIP TC1, held at Tohoku 
University, Sendai, Japan in August 2000. 

The IFIP TCS 2000 technical program consists of invited talks, contributed 
talks, and a panel discussion. In conjunction with this program there are two 
special open lectures by Professors Jan van Leeuwen and Peter D. Mosses. 

The decision to hold this conference was made by IFIP TC1 in August 1998, 
and since then IFIP TCS 2000 has benefited from the efforts of many people; 
in particular, the TC1 members and the members of the Steering Committee, 
the Program Committee, and the Organizing Committee of the conference. Our 
special thanks go to the Program Committee Co-chairs: 

Track (1): Jan van Leeuwen (U. Utrecht), Osamu Watanabe (Tokyo Inst. Tech.) 
Track (2): Masami Hagiya (U. Tokyo), Peter D. Mosses (U. Aarhus). 

The details of the conference were planned by the Steering Committee with 
the help of the PC Co-chairs. The Steering Committee members are 

Giorgio Ausiello (U. Roma "La Sapienza") Chair 
Wilfried Brauer (TU München) 

Takayasu Ito (Tohoku U.) 

Michael O. Rabin (Harvard U.) 

John Staples (U. Queensland) 

Joseph Traub (Columbia U.) 

Professor Michael O. Rabin of Harvard University accepted our invitation to 
be the banquet speaker of IFIP TCS 2000. 

IFIP TCS 2000 and the IFIP Technical Committee TC1 gratefully acknowledge 
the partial support provided by Tohoku University and Sendai Tourism & 
Convention Bureau, and the cooperation of the following organizations: 

Information Processing Society of Japan 

Japan Society of Software Science and Technology 

European Association of Theoretical Computer Science 

Association of Symbolic Logic 

Association for Computing Machinery-SIGACT 



We would like to thank Plamen Nedkov and Dorothy Hayden of the IFIP 
office for their assistance on IFIP procedures, and we also thank Alfred Hoffman 
of Springer-Verlag for his assistance in the publication of the proceedings. Masahiko 
Ohtomo, Shinya Miyakawa, Nobuto Izumi, and Takuya Ohishi of Tohoku 
University lent their assistance in making local arrangements and in preparing 
the proceedings and the conference web pages. Finally, we would like to express 
our sincere thanks to all those who helped organize, and participated in, the 
IFIP TCS 2000 conference for their invaluable contributions. 



May 2000 Giorgio Ausiello 

Takayasu Ito 
Conference Co-chairs 




Preface 



IFIP TCS 2000 is the first international conference organized by IFIP TC1, whose 
activities cover the entire field of theoretical computer science. The major 
topics of the conference were chosen to reflect current activities in theoretical 
computer science, forming two tracks: 

Track (1) on Algorithms, Complexity, and Models of Computation, 

Track (2) on Logic, Semantics, Specification, and Verification. 

The program of IFIP TCS 2000 included the presentations of eighteen contri- 
buted papers of Track (1) and fourteen contributed papers of Track (2). The 
Program Committee selected them from forty submissions to Track (1) and 
thirty submissions to Track (2), with the help of additional referees. 

The invited speakers consist of the three keynote plenary invited speakers, 
and the three invited speakers for each track; they were chosen by the Steering 
Committee and the PC Co-chairs. 

This volume constitutes the record of the technical program, consisting of 
the contributed papers and the invited talks. We had the pleasure of chairing 
the program committee of the first IFIP International Conference on Theoretical 
Computer Science with collaboration of the conference chairs. We are extremely 
grateful to Takayasu Ito and his staff, who helped us in making and announcing 
the call for papers, the program, and their web pages, and in preparing the 
proceedings. 

We would like to express our thanks to the other members of the Program 
Committee and the additional referees, who are listed below, for their help in 
reviewing all submissions and for selecting the papers. 
Co-chairs 

(The Computational Soundness of Formal Encryption) 

Abstract. Two distinct, rigorous views of cryptography have developed 
over the years, in two mostly separate communities. One of the views re- 
lies on a simple but effective formal approach; the other, on a detailed 
computational model that considers issues of complexity and probability. 
There is an uncomfortable and interesting gap between these two appro- 
aches to cryptography. This paper starts to bridge the gap, by providing 
a computational justification for a formal treatment of encryption. 



1 Two Views of Cryptography 



A fairly abstract view of cryptographic operations is often adequate for the 
design, analysis, and implementation of systems that use cryptography. For ex- 
ample, it is often convenient to ignore the details of an encryption function, and 
to work instead with a high-level description of what encryption is supposed to 
achieve. 

At least two distinct abstract views of cryptographic operations have develo- 
ped over the years. They are both consistent and they have both been useful, but 
they come from two mostly separate communities and they are quite different. In 
one of them, cryptographic operations are seen as functions on a space of symbolic 
(formal) expressions; their security properties are also modeled formally. 
In the other, cryptographic operations are seen as functions on strings of bits; 
their security properties are defined in terms of the probability and 
computational complexity of successful attacks. 



There is an uncomfortable gap between these two views. In this paper, we 
call attention to this gap and start to bridge it. Representing the two views, we 
give two accounts of symmetric (shared-key) encryption: a simple one, based on 
a formal system, and a more elaborate one, based on a computational model. 
Our main theorem is a soundness result that relates the two accounts. It esta- 
blishes that secrecy properties that can be proved in the formal world are true 
in the computational world. Thus, we obtain a computational justification for 
the formal treatment of encryption. 



J. van Leeuwen et al. (Eds.): IFIP TCS2000, LNCS 1872, pp. 3-^^ 2000. 
© Springer-Verlag Berlin Heidelberg 2000 
As we relate the two accounts of encryption, we identify and make explicit 
some important choices. In particular, our main theorem excludes certain encryp- 
tion cycles (such as encrypting a key with itself); we argue that this restriction 
is reasonable and necessary, although formal approaches generally ignore it. We 
also consider, for example, whether two ciphertexts may reveal that they were 
produced using the same key. 

We believe that this paper suggests a profitable line of further research. It 
will take a significant research effort to relate the views of the people who invent, 
implement, break, and use cryptography. Continuing this work, it would be wor- 
thwhile to consider other cryptographic operations (such as signatures and hash 
functions), and to treat complete security protocols (such as key-distribution 
protocols) in addition to basic algorithms. 

Connections between the formal view and the computational view should 
ultimately benefit both: 

— These connections should strengthen the foundations of formal cryptology, 
and help in elucidating implicit assumptions and gaps in formal methods. 
They should confirm or improve the relevance of formal proofs about a pro- 
tocol to concrete instantiations of the protocol, making explicit requirements 
on the implementations of cryptographic operations. 

— Methods for high-level reasoning seem necessary for computational crypto- 
logy as it treats increasingly complex systems. Formal approaches suggest 
such high-level reasoning principles, and even permit automated proofs. In 
addition, some formal approaches capture naive but powerful intuitions ab- 
out cryptography; a link with those intuitions should increase the appeal 
and accessibility of computational cryptology. 

The next section is a more detailed discussion of the two views of cryptogra- 
phy; it also mentions related work. The rest of the paper proceeds as follows. 

In Section 3 we define a class of expressions and an equivalence relation on 
those expressions. The expressions represent data, of the sort used in messages 
in security protocols; the equivalence relation captures when two pieces of data 
“look the same” to an adversary, treating encryption as a formal operator. These 
definitions are simple and purely syntactic. In particular, they do not require 
any notion of probability or computational complexity. They are typical of the 
definitions given in formal treatments of cryptography, and directly inspired by 
some of them. 

Then, in Section 4 we present a computational model with strings of bits, 
probabilities, and complexities. In this model, we define secure encryption in 
terms of computational indistinguishability; our definition is similar, but not 
identical, to those of semantic security. 

Finally, in Section 5 we relate equivalence to computational indistinguishability. 
We associate a probability ensemble with each formal expression; our main 
theorem establishes that equivalent expressions induce computationally indistin- 
guishable ensembles. For example, the two expressions that represent two pieces 
of data encrypted under a fresh key will be equivalent. This equivalence can be 
read as a secrecy property, namely that the ciphertexts do not reveal the data. 
Our main theorem implies that the two expressions correspond to computationally 
indistinguishable ensembles. 

2 Background and Related Work 

This section explains the two views of cryptography, still informally. It points to 
a few examples of work informed by those two views; there are many more. It 
also describes some related research. 

The formal view. There is a large body of literature that treats cryptographic 
operations as purely formal. There, for example, the expression {M}_K may 
represent an encrypted message, with plaintext M and key K. All of {M}_K, M, 
and K are formal expressions, rather than sequences of bits. Various functions 
can be applied to such expressions, yielding other expressions. One of them is 
decryption, which produces M from {M}_K and K. Crucially, there is no way 
to recover M or K from {M}_K alone. Thus, the idealized security properties of 
encryption are modeled (rather than defined). They are built into the model of 
computation on expressions. 

This body of literature starts with the work of Dolev and Yao; DeMillo, 
Lynch, and Merritt; Millen, Clark, and Freedman; Kemmerer; Burrows, 
Abadi, and Needham; and Meadows. It includes many different 
agendas and approaches, with a variety of techniques from the fields of rewriting, 
modal logic, process algebra, and others. Over the years, it has been used in the 
design of protocols, it has helped develop confidence in some existing protocols, 
and it has enabled the discovery of many attacks. It has also led to the deve- 
lopment of effective methods and tools for automated protocol analysis; Lowe’s 
and Paulson’s works are two recent examples of these advances. 

This formal perspective is fairly easy to apply for the users of encryption, 
for example for protocol designers. It captures an important intuition: an en- 
crypted message reveals its plaintext only to those that know the corresponding 
decryption key, and it reveals nothing to others. This assertion is a simple (and 
simplistic) all-or-nothing statement, which can be conveniently built into a for- 
mal method. In particular, it does not require any notion of probability or of 
computational complexity: there is no need to say that an adversary may obtain 
some data but only with low probability or after an expensive computation. (However, 
probability and computational complexity are compatible with formalism, 
as demonstrated by the work of Lincoln et al.) 

Those who employ the formal definitions often warn that a formal proof does 
not imply a guarantee of security. One of the reasons for this caveat is the gap 
between the representation of encryption in a formal model and its concrete 
implementation. At the very least, it is desirable to know what assumptions 
about encryption are necessary. Those assumptions have seldom been stated 
explicitly, and not in enough detail to permit systematic discussion and rigorous 
proofs. We aim to remedy this situation. 

A somewhat similar situation arises from the use of the random-oracle model 
in cryptography: proofs that assume random oracles do not automatically 
yield guarantees when the oracles are instantiated. However, we do not know of 
any natural examples where this gap has manifested itself. 

The computational view. Another school of cryptographic research is based on 
the framework of computational complexity theory. A typical member of that 
school would probably say that the formal perspective is naive and disconnec- 
ted from the realities of concrete cryptographic algorithms and protocols. Keys, 
plaintexts, and ciphertexts are all just strings of bits. An encryption function 
is just an algorithm. An adversary is essentially a Turing machine. Good pro- 
tocols are those in which adversaries cannot do “something bad” too often and 
efficiently enough. These definitions are all about success probabilities and com- 
putational cost. 

This computational view originates in the work of Blum and Micali, 
Yao, and Goldwasser and Micali. It has strengthened the scientific foundations 
of cryptography, with a sophisticated body of definitions and theorems. 
It has also played a significant role in the development and study of particular 
protocols. 

As an important example of the computational approach, we sketch a notion 
of secure encryption. Specifically, we choose to treat symmetric encryption, 
following Bellare, Desai, Jokipii, and Rogaway. An encryption scheme is defined 
as a triple of algorithms Π = (𝒦, ℰ, 𝒟). Algorithm 𝒦 (the key generator) 
makes random choices and then outputs a string k. Algorithm ℰ (the encryption 
algorithm) flips random coins r to map strings k and m into a string ℰ_k(m, r). 
Algorithm 𝒟 (the decryption algorithm) maps strings k and c into a string 𝒟_k(c). 
We expect that 𝒟_k(ℰ_k(m, r)) = m for appropriate k, m, and r. 

An adversary for an encryption scheme Π = (𝒦, ℰ, 𝒟) is a Turing machine 
which has access to an oracle. We imagine realizing this oracle in one of two 
ways. In the first, the oracle chooses (once and for all) a random key k, and 
then encrypts each query x using ℰ_k and fresh random coins. In the second, 
the oracle chooses (once and for all) a key k, and then, when presented with a 
query x, encrypts a string of 0 bits of equal length, using fresh random coins. 
An adversary’s advantage is the probability that the adversary outputs 1 when 
the oracle is realized in the first way minus the probability that the adversary 
outputs 1 when the oracle is realized in the second way. An encryption scheme 
is regarded as good if an adversary’s maximal advantage is a slow-growing function 
of the adversary’s computational resources. This definition of security can 
be worked out rigorously and elegantly in both asymptotic and concrete versions 
(see Section 4). In any case, it is based on notions of probability and 
computational power. 
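
The two-oracle experiment can be made concrete with a small simulation. The sketch below is only illustrative: the hash-based stream cipher, the deterministic variant weak_encrypt, the single fixed adversary, and all function names are assumptions introduced here, not constructions from the paper. It estimates one adversary's advantage empirically; a near-zero estimate for one adversary says nothing about security in general.

```python
import hashlib
import os


def keygen() -> bytes:
    """Key generator: outputs a random 16-byte key (illustrative choice)."""
    return os.urandom(16)


def _stream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Toy keystream from SHA-256 in counter mode (not a vetted cipher)."""
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def encrypt(key: bytes, m: bytes) -> bytes:
    """Randomized encryption: the fresh coins are a random nonce."""
    nonce = os.urandom(16)
    return nonce + bytes(a ^ b for a, b in zip(m, _stream(key, nonce, len(m))))


def decrypt(key: bytes, c: bytes) -> bytes:
    nonce, body = c[:16], c[16:]
    return bytes(a ^ b for a, b in zip(body, _stream(key, nonce, len(body))))


def weak_encrypt(key: bytes, m: bytes) -> bytes:
    """Deterministic variant with no coins, included only for contrast."""
    return bytes(a ^ b for a, b in zip(m, _stream(key, b"", len(m))))


def real_oracle(enc, key):
    """First realization: encrypt each query x itself."""
    return lambda x: enc(key, x)


def zero_oracle(enc, key):
    """Second realization: encrypt a string of 0 bits of the same length."""
    return lambda x: enc(key, bytes(len(x)))


def adversary(oracle) -> int:
    """Outputs 1 if two different equal-length queries give different answers."""
    return int(oracle(b"\x00" * 16) != oracle(b"\xff" * 16))


def estimate_advantage(enc, trials: int = 200) -> float:
    """Empirical Pr[output 1 | first oracle] minus Pr[output 1 | second oracle]."""
    real = sum(adversary(real_oracle(enc, keygen())) for _ in range(trials))
    zero = sum(adversary(zero_oracle(enc, keygen())) for _ in range(trials))
    return (real - zero) / trials


weak_advantage = estimate_advantage(weak_encrypt)
randomized_advantage = estimate_advantage(encrypt)
```

Against the deterministic variant this adversary always answers 1 with the first oracle and 0 with the second, so its advantage is maximal; against the randomized variant it answers 1 in both settings, so the comparison tells it nothing.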



Related work. The desire to relate the two views of cryptography is not entirely 
new. Nevertheless, there have been hardly any research efforts 
in this general direction. The work of Pfitzmann, Schunter, and Waidner 
(which is simultaneous to ours and independent) starts from motivations similar 
to our own. It proves that some reactive, cryptographic systems satisfy 
high-level (non-cryptographic) specifications, under computational assumptions 
on cryptographic operations. These results do not concern a formal model of 
cryptography, such as the one studied in this paper, but the relation to a for- 
mal model of cryptography is mentioned as an interesting subject for further 
work. Also relevant is the work of Lincoln, Mitchell, Mitchell, and Scedrov, 
which develops a rich process-algebraic framework that draws on both views of 
cryptography. Further afield, Abadi, Fournet, and Gonthier, and also Lynch, 
relate the formal view of cryptography with higher-level (non-cryptographic) descriptions 
of security mechanisms. Finally, Volpano and Smith analyze the 
complexity of attacking programs written in a simple, typed language; however, 
this language does not include cryptographic primitives. 

As we compare two accounts of encryption, we arrive at the concept of which-key 
concealing encryption, with which ciphertexts do not manifest whether they 
were produced using the same key (see Section 4.2). Independently and concurrently, 
the work of Bellare, Boldyreva, Desai, and Pointcheval studies this concept 
from a different perspective. 

3 Formal Encryption and Expression Equivalence 

In this section we present the formal view of cryptography, specifically treating 
symmetric encryption. We describe the space of expressions on which encryption 
operates, and what it means for two expressions to be equivalent. 

As explained in the introduction, the expressions represent data, of the sort 
used in messages in security protocols. Expressions are built up from bits and 
keys by pairing and encryption. The equivalence relation captures when two 
pieces of data “look the same” to an adversary that has no prior knowledge of 
the keys used in the data. For example, an adversary (with no prior knowledge) 
cannot obtain the key K from the ciphertexts {0}_K and {1}_K; therefore, the adversary 
cannot decrypt and distinguish these ciphertexts, so they are equivalent. 
Similarly, the pairs (0, {0}_K) and (0, {1}_K) are equivalent. On the other hand, 
the pairs (K, {0}_K) and (K, {1}_K) are not equivalent, since an adversary can 
obtain K from them, then decrypt {0}_K or {1}_K and obtain 0 or 1, respectively, 
thus distinguishing the pairs. In this section, we formalize these informal arguments 
about equivalence; the soundness theorem of Section 5 provides a further 
justification for them. 



3.1 Expressions 

We write Bool for the set of bits {0,1}. These bits can be used to spell out 
numbers and principal names, for example. We write Keys for a fixed, nonempty 
set of symbols disjoint from Bool. The symbols K, K', K'', ... and K_1, K_2, ... are 
all in Keys. Informally, elements of the set Keys represent cryptographic keys, 
generated randomly by a principal that is constructing an expression. Formally, 
however, keys are atomic symbols, not strings of bits. We write Exp for the set 
of expressions defined by the grammar¹: 



    M, N ::=        expressions 
        K           key (for K ∈ Keys) 
        i           bit (for i ∈ Bool) 
        (M, N)      pair 
        {M}_K       encryption (for K ∈ Keys) 



Informally, (M, N) represents the pairing of M and N, which might be implemented 
by concatenation plus markers, and {M}_K represents the encryption of 
M under K, which might be implemented using a symmetric algorithm like DES, 
in CBC mode and with a random initialization vector. Pairing and encryption 
can be nested, as in the expression ({{(0, K')}_K}_{K'}, K). 

We emphasize that the elements of Exp are formal expressions (essentially, 
parse trees, abstract syntax trees) rather than actual keys, bits, concatenations, 
or encryptions. In particular, they are unambiguous: for example, (M, N) equals 
(M', N') if and only if M equals M' and N equals N', and it never equals 
{M'}_K. Similarly, {M}_K equals {M'}_{K'} if and only if M equals M' and K 
equals K'. However, according to definitions given below, {M}_K and {M'}_{K'} 
may be equivalent even when M and M' are different and when K and K' are 
different. 
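
One way to picture expressions as unambiguous syntax trees is to encode each grammar production as an immutable record, so that equality is structural. The following is a minimal sketch; the class names Key, Bit, Pair, and Enc and the variable names are illustrative assumptions, not notation from the paper.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Key:
    name: str          # an atomic symbol, not a string of bits


@dataclass(frozen=True)
class Bit:
    value: int         # 0 or 1


@dataclass(frozen=True)
class Pair:
    left: "Exp"
    right: "Exp"


@dataclass(frozen=True)
class Enc:
    body: "Exp"        # the plaintext expression
    key: Key           # the encryption key


Exp = Union[Key, Bit, Pair, Enc]

K, K_prime = Key("K"), Key("K'")

# A nested expression ({{(0, K')}_K}_{K'}, K), built twice independently.
nested = Pair(Enc(Enc(Pair(Bit(0), K_prime), K), K_prime), K)
rebuilt = Pair(Enc(Enc(Pair(Bit(0), K_prime), K), K_prime), K)

# Structural equality: same shape and same leaves.
same_tree = nested == rebuilt
# A pair never equals an encryption, even with the same components.
pair_vs_enc = Pair(Bit(0), K) == Enc(Bit(0), K)
# Encryptions are equal only if both plaintext and key are equal.
different_key = Enc(Bit(0), K) == Enc(Bit(0), K_prime)
```

Frozen dataclasses also make the trees hashable, which is convenient when collecting sets of subexpressions.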

There are several possible extensions of the set of expressions: 



— We could allow expressions of the form {M}_N, where an arbitrary expression 
N is used as encryption key. 

— We could distinguish encryption keys from decryption keys, as in public-key 
cryptosystems. 



These extensions are useful in modeling realistic protocols, but would complicate 
our definitions and theorems. We therefore leave them for further work. 

It is also important to consider a restriction to the set of expressions. We 
say that K encrypts K' in M if there exists an expression N such that {N}_K 
is a subexpression of M and K' occurs in N. For each M, this defines a binary 
relation on keys (the “encrypts” relation). We say that M is cyclic (or acyclic) 
if its associated relation is cyclic (or acyclic, respectively). For example, {K}_K 
and ({K}_{K'}, {K'}_K) are both cyclic, while ({K}_{K'}, {0}_K) is acyclic. 
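
The “encrypts” relation and the cyclicity check can be sketched as a small graph computation. The encoding of expressions as dataclasses and all names below are assumptions made for illustration. One further assumption: a key is counted as occurring in N whether it appears as data or as an encryption key inside N; the three examples above do not depend on that choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Key:
    name: str


@dataclass(frozen=True)
class Bit:
    value: int


@dataclass(frozen=True)
class Pair:
    left: object
    right: object


@dataclass(frozen=True)
class Enc:
    body: object
    key: Key


def occurring_keys(m):
    """Names of all keys occurring anywhere in m (assumption: subscripts count)."""
    if isinstance(m, Key):
        return {m.name}
    if isinstance(m, Bit):
        return set()
    if isinstance(m, Pair):
        return occurring_keys(m.left) | occurring_keys(m.right)
    return {m.key.name} | occurring_keys(m.body)


def encrypts(m):
    """Set of pairs (K, K') such that K encrypts K' in m."""
    if isinstance(m, (Key, Bit)):
        return set()
    if isinstance(m, Pair):
        return encrypts(m.left) | encrypts(m.right)
    return {(m.key.name, k) for k in occurring_keys(m.body)} | encrypts(m.body)


def is_cyclic(m):
    """True if the encrypts relation of m contains a cycle."""
    graph = {}
    for a, b in encrypts(m):
        graph.setdefault(a, set()).add(b)

    def reaches(start, node, seen):
        for nxt in graph.get(node, ()):
            if nxt == start:
                return True
            if nxt not in seen:
                seen.add(nxt)
                if reaches(start, nxt, seen):
                    return True
        return False

    return any(reaches(k, k, set()) for k in graph)


K, K2 = Key("K"), Key("K'")
self_encryption = Enc(K, K)                      # {K}_K
mutual = Pair(Enc(K, K2), Enc(K2, K))            # ({K}_{K'}, {K'}_K)
harmless = Pair(Enc(K, K2), Enc(Bit(0), K))      # ({K}_{K'}, {0}_K)

results = [is_cyclic(self_encryption), is_cyclic(mutual), is_cyclic(harmless)]
```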

Cycles, such as encrypting a key under itself, are a source of errors in practice; 
they also lead to weaknesses in common computational models, 
as explained in Section 4. Moreover, cycles can often be avoided in practice, 
and they should generally be avoided given what is, and is not, known about 
them. The soundness theorem of Section 5 deals only with acyclic expressions. In 
contrast, cycles are typically permitted (without discussion) in formal methods. 

¹ An equivalent way to define Exp is as the language generated by the context-free 
grammar with start symbol E, nonterminals E and K, terminals “0”, “1”, “(”, “)”, 
“,”, “{”, “}”, and the set of elements in Keys, and the productions: 

    E → 0 | 1 | (E, E) | K | {E}_K 
    K → K for each K ∈ Keys 

3.2 Equivalence 

Next we give a formal definition of equivalence of expressions. It draws on definitions 
from the works of Syverson and van Oorschot, Schneider, Paulson, and 
others. Some of the auxiliary definitions concern how expressions can 
be analyzed and synthesized; such definitions are quite common in formal me- 
thods for protocol analysis. Equivalence relations are useful in semantics of modal 
logics: in such semantics, one says that two states in a computation “look the 
same” to a principal only if the principal has equivalent expressions in those 
states. Equivalence relations also appear in bisimulation proof techniques 
where one requires that bisimilar processes produce equivalent messages. 

First, we define an entailment relation M ⊢ N, where M and N are expressions. 
Intuitively, M ⊢ N means that N can be computed from M. Formally, we 
define the relation inductively, as the least relation with the following properties: 

— M ⊢ 0 and M ⊢ 1, 

— M ⊢ M, 

— if M ⊢ N_1 and M ⊢ N_2 then M ⊢ (N_1, N_2), 

— if M ⊢ (N_1, N_2) then M ⊢ N_1 and M ⊢ N_2, 

— if M ⊢ N and M ⊢ K then M ⊢ {N}_K, 

— if M ⊢ {N}_K and M ⊢ K then M ⊢ N. 

This definition of M ⊢ N models what an attacker can obtain from M without 
any prior knowledge of the keys used in M. For example, we have 

    ({{K_1}_{K_2}}_{K_3}, K_3) ⊢ K_3 

and 

    ({{K_1}_{K_2}}_{K_3}, K_3) ⊢ {K_1}_{K_2} 

but not 

    ({{K_1}_{K_2}}_{K_3}, K_3) ⊢ K_1    (false) 

It is simple to derive a more general definition from this one: obtaining N from 
M with prior knowledge of K is equivalent to obtaining N from (M, K) with no 
prior knowledge. 
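
Entailment can be decided mechanically. The sketch below is one possible approach and rests on an assumption that the text does not state: since keys are atomic, it is enough to first take M apart as far as possible (unpairing, and decrypting with keys already recovered) and then check whether N can be rebuilt from the recovered pieces by pairing and encryption. The dataclass encoding and the function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Key:
    name: str


@dataclass(frozen=True)
class Bit:
    value: int


@dataclass(frozen=True)
class Pair:
    left: object
    right: object


@dataclass(frozen=True)
class Enc:
    body: object
    key: Key


def analyze(m):
    """All expressions obtainable from m by unpairing and decryption."""
    known = {m}
    changed = True
    while changed:
        changed = False
        for e in list(known):
            if isinstance(e, Pair):
                parts = (e.left, e.right)
            elif isinstance(e, Enc) and e.key in known:
                parts = (e.body,)          # decryption needs the key itself
            else:
                parts = ()
            for part in parts:
                if part not in known:
                    known.add(part)
                    changed = True
    return known


def entails(m, n):
    """Decide whether n can be computed from m under the rules above."""
    known = analyze(m)

    def synth(x):
        if x in known or isinstance(x, Bit):   # bits are always available
            return True
        if isinstance(x, Pair):
            return synth(x.left) and synth(x.right)
        if isinstance(x, Enc):
            return synth(x.key) and synth(x.body)
        return False                           # an unrecovered key

    return synth(n)


K1, K2, K3 = Key("K1"), Key("K2"), Key("K3")
M = Pair(Enc(Enc(K1, K2), K3), K3)             # ({{K_1}_{K_2}}_{K_3}, K_3)

gets_K3 = entails(M, K3)
gets_inner = entails(M, Enc(K1, K2))
gets_K1 = entails(M, K1)
```

Prior knowledge of a key K can be modeled as in the text, by asking the same question about the pair (M, K).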

Next, we introduce the box symbol □, which represents a ciphertext that an 
attacker cannot decrypt. We define the set Pat of patterns as an extension of 
the set of expressions, with the grammar: 

    P, Q ::=        patterns 
        K           key (for K ∈ Keys) 
        i           bit (for i ∈ Bool) 
        (P, Q)      pair 
        {P}_K       encryption (for K ∈ Keys) 
        □           undecryptable 

Intuitively, a pattern is an expression that may have some parts that an attacker 

cannot decrypt. 

We define a function that, given a set of keys T and an expression M, reduces 
M to a pattern. Intuitively, this is the pattern that an attacker can see in M if 
the attacker has the keys in T. 



    p(K, T) = K                          (for K ∈ Keys) 
    p(i, T) = i                          (for i ∈ Bool) 
    p((M, N), T) = (p(M, T), p(N, T)) 
    p({M}_K, T) = {p(M, T)}_K if K ∈ T, and □ otherwise 



Further, we define a pattern for an expression without an auxiliary set T, but 
using the set of keys obtained from the expression itself. 



    pattern(M) = p(M, {K ∈ Keys | M ⊢ K}) 



Intuitively, this is the pattern that an attacker can see in M using the set of 
keys obtained from M. (As above, we assume that the attacker has no prior 
knowledge of the keys used in M, without loss of generality.) For example, we 
have 

    pattern(({{K_1}_{K_2}}_{K_3}, K_3)) = ({□}_{K_3}, K_3) 



Finally, we say that two expressions are equivalent if they yield the same 
pattern: 



    M ≡ N if and only if pattern(M) = pattern(N) 



For example, we have: 



    ({{K_1}_{K_2}}_{K_3}, K_3) ≡ ({{0}_{K_1}}_{K_3}, K_3) 

since both expressions yield the pattern ({□}_{K_3}, K_3). 
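
The pattern construction and the resulting equivalence test can be sketched in a few lines. The encoding below is illustrative (class and function names are assumptions). It also assumes that the clause of p for encryptions keeps {p(M, T)}_K when K is in T and yields the box otherwise, and that the keys entailed by M are exactly those reached by repeated unpairing and decryption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Key:
    name: str


@dataclass(frozen=True)
class Bit:
    value: int


@dataclass(frozen=True)
class Pair:
    left: object
    right: object


@dataclass(frozen=True)
class Enc:
    body: object
    key: Key


@dataclass(frozen=True)
class Box:
    """The box symbol: a ciphertext the attacker cannot decrypt."""


def recoverable_keys(m):
    """The set {K | M entails K}, found by unpairing and decrypting."""
    known, changed = {m}, True
    while changed:
        changed = False
        for e in list(known):
            if isinstance(e, Pair):
                parts = (e.left, e.right)
            elif isinstance(e, Enc) and e.key in known:
                parts = (e.body,)
            else:
                parts = ()
            for part in parts:
                if part not in known:
                    known.add(part)
                    changed = True
    return {e for e in known if isinstance(e, Key)}


def p(m, keys):
    """Reduce expression m to the pattern visible with the given keys."""
    if isinstance(m, (Key, Bit)):
        return m
    if isinstance(m, Pair):
        return Pair(p(m.left, keys), p(m.right, keys))
    return Enc(p(m.body, keys), m.key) if m.key in keys else Box()


def pattern(m):
    return p(m, recoverable_keys(m))


def equivalent(m, n):
    """Two expressions are equivalent if they yield the same pattern."""
    return pattern(m) == pattern(n)


K, K1, K2, K3 = Key("K"), Key("K1"), Key("K2"), Key("K3")
M = Pair(Enc(Enc(K1, K2), K3), K3)
N = Pair(Enc(Enc(Bit(0), K1), K3), K3)

pattern_of_M = pattern(M)
M_equiv_N = equivalent(M, N)
hidden_bits = equivalent(Enc(Bit(0), K), Enc(Bit(1), K))
revealed_bits = equivalent(Pair(K, Enc(Bit(0), K)), Pair(K, Enc(Bit(1), K)))
```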

We may view keys as bound names, subject to renaming (as in the spi calculus). 
For example, although ({0}_K, K) and ({0}_{K'}, K') are not equivalent, 
we may say that they are equivalent up to renaming. More generally, we define 
equivalence up to renaming, ≅, as follows: 

    M ≅ N if and only if there exists a bijection σ on Keys 
    such that M ≡ Nσ 

where Nσ is the result of applying σ as a substitution to N. Although this 
relation ≅ is looser than ≡, our soundness theorem treats it smoothly, without 
difficulty. Therefore, we focus on ≅. In informal discussions, we often do not 
distinguish the two relations, calling them both equivalence. 
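
Equivalence up to renaming can be tested without enumerating bijections. The sketch below uses a canonical-naming trick that is an assumption of this example, not a method given in the paper: keys in each pattern are renamed in order of first occurrence, and two expressions are taken to be equivalent up to renaming when the canonical patterns coincide. Any matching of the keys visible in the two patterns extends to a bijection on Keys, because keys hidden inside boxes can be renamed freely. The encoding and names are illustrative, and the encryption clause of p is assumed to yield the box when the key is not available.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Key:
    name: str


@dataclass(frozen=True)
class Bit:
    value: int


@dataclass(frozen=True)
class Pair:
    left: object
    right: object


@dataclass(frozen=True)
class Enc:
    body: object
    key: Key


@dataclass(frozen=True)
class Box:
    """A ciphertext the attacker cannot decrypt."""


def recoverable_keys(m):
    known, changed = {m}, True
    while changed:
        changed = False
        for e in list(known):
            if isinstance(e, Pair):
                parts = (e.left, e.right)
            elif isinstance(e, Enc) and e.key in known:
                parts = (e.body,)
            else:
                parts = ()
            for part in parts:
                if part not in known:
                    known.add(part)
                    changed = True
    return {e for e in known if isinstance(e, Key)}


def pattern(m, keys=None):
    keys = recoverable_keys(m) if keys is None else keys
    if isinstance(m, (Key, Bit)):
        return m
    if isinstance(m, Pair):
        return Pair(pattern(m.left, keys), pattern(m.right, keys))
    return Enc(pattern(m.body, keys), m.key) if m.key in keys else Box()


def canonical(pat):
    """Rename keys to k0, k1, ... in order of first occurrence."""
    names = {}

    def go(q):
        if isinstance(q, Key):
            return Key(names.setdefault(q.name, f"k{len(names)}"))
        if isinstance(q, Pair):
            return Pair(go(q.left), go(q.right))
        if isinstance(q, Enc):
            body = go(q.body)              # visit the body before the key
            return Enc(body, go(q.key))
        return q                           # Bit or Box

    return go(pat)


def equivalent(m, n):
    return pattern(m) == pattern(n)


def equivalent_up_to_renaming(m, n):
    return canonical(pattern(m)) == canonical(pattern(n))


K, K2 = Key("K"), Key("K'")
left = Pair(Enc(Bit(0), K), K)             # ({0}_K, K)
right = Pair(Enc(Bit(0), K2), K2)          # ({0}_{K'}, K')

strict = equivalent(left, right)
renamed = equivalent_up_to_renaming(left, right)
collapsed = equivalent_up_to_renaming(Pair(K, K2), Pair(K, K))
```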
3.3 Some Examples and Some Subtleties 

In this section we give a few more examples. Some of the examples indicate 
assumptions and choices built into the definition of equivalence. These are fairly 
subtle but important, and it is useful to be explicit about them. We revisit them 
in Section 4. 

— 0 ≅ 0, of course. 

— 0 ≇ 1, of course. 

— {0}_K ≅ {1}_K. 

— (K, {0}_K) ≇ (K, {1}_K), but (K, {({0}_{K'}, 0)}_K) ≅ (K, {({1}_{K'}, 0)}_K). 

— K ≢ K' and K ≅ K', since keys are subject to renaming with ≅ but not 
with ≡. 

— {0}_K ≅ {1}_{K'} and even {0}_K ≡ {1}_{K'}, although the two ciphertexts are 
under different keys. 

— Similarly, {{K'}k,{^}k) = {l}/c0 and {{K'}k,{^}k) = {{K'}k, 

— {0}_K ≅ {K}_K, despite the encryption cycle in {K}_K. 

— {(((1,1),(1,1)),((1,1),(1,1)))}_K ≅ {0}_K. 

Informally, we are assuming that a plaintext of any size can be encrypted, and 
that the size of the plaintext cannot be deduced from the resulting cipher- 
text without knowledge of the corresponding decryption key. This property 
justifies equivalences such as the one above, where the two plaintexts are of 
different sizes. In an implementation, it can be guaranteed by padding plaintexts 
up to a maximum size, and truncating larger expressions or mapping 
them to some fixed string (see Section 4). 

We could easily refine the equivalence relation to make it sensitive to sizes, for 
example by introducing a symbol □_n for each size n. The resulting definitions 
would be heavier. 

Informally, we are assuming that an attacker who does not have a key cannot 
even detect whether two plaintexts encrypted under the key are identical. For 
example, the attacker should not be able to tell that the same plaintext appears 
twice under K in ({0}_K, {0}_K), hence ({0}_K, {0}_K) ≅ ({0}_K, {1}_K). 
In an implementation, this sort of equivalence can be guaranteed by randomization 
of the encryption function (see Section 4). 

We could easily refine the equivalence relation to make it sensitive to message 
identities (for example as in ^); but, again, the resulting definitions would 
be heavier. 

Informally, we are assuming that an attacker who does not have a key
cannot even detect whether two ciphertexts use the same key. For example,
the attacker should not be able to tell that the same key is used twice in
$(\{0\}_K, \{1\}_K)$, hence $(\{0\}_K, \{1\}_K) \cong (\{0\}_K, \{1\}_{K'})$.

Again, an alternative definition would be possible, with some complications. 
4 The Computational View: Encryption Schemes and
Indistinguishability 

In this section we provide a computational treatment for symmetric encryption. 
First we describe the functions that constitute a symmetric encryption scheme, 
and then we describe when an encryption scheme should be called secure. Ac- 
tually, there are a few different possibilities for defining security, and we discuss 
several of them. The notion that we focus on — which we call type-0 security — 
is stronger than the customary notion of security (that is, semantic security, 
and notions equivalent to it 1 1 iSI7] 1 . Nonetheless, one can achieve type-0 security 
under standard complexity-theoretic assumptions. We focus on type-0 security 
because it matches up with the formal definitions of Section 3. Other
computational notions of security can be paired with analogous formal ones.

4.1 Preliminaries 

Elements of an encryption scheme. Let String = $\{0,1\}^*$ be the set of all finite
strings, and let $|x|$ be the length of string $x$. Let Plaintext, Ciphertext, and Key
be nonempty sets of finite strings. Let 0 be a particular string in Plaintext.
Encrypting a string not in Plaintext will result in a ciphertext that decrypts
to 0. We assume that if $x \in$ Plaintext then $x' \in$ Plaintext for all $x'$ of the same
length as $x$. Let Key be endowed with some fixed distribution. (If Key is finite,
the distribution on Key is the uniform one.) Let Coins be a synonym for $\{0,1\}^\omega$
(the set of infinite strings), and Parameter (the set of security parameters) be a
synonym for $1^*$ (the set of finite strings of 1 bits).

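An encryption scheme, $\Pi$, is a triple of algorithms $(\mathcal{K}, \mathcal{E}, \mathcal{D})$, where

$\mathcal{K}$ : Parameter $\times$ Coins $\to$ Key

$\mathcal{E}$ : Key $\times$ String $\times$ Coins $\to$ Ciphertext

$\mathcal{D}$ : Key $\times$ String $\to$ Plaintext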

and each algorithm is computable in time polynomial in the size of its input
(but without consideration for the size of the Coins input). Algorithm $\mathcal{K}$ is called
the key-generation algorithm, $\mathcal{E}$ is called the encryption algorithm, and $\mathcal{D}$ is
called the decryption algorithm. We usually write the first argument to $\mathcal{E}$ or $\mathcal{D}$,
the key, as a subscript. When we omit mention of the final argument to $\mathcal{K}$ or $\mathcal{E}$,
this indicates the corresponding probability space, or, when used as a set, the
support of that probability space (that is, the strings which are output with
nonzero probability). We require that for all $\eta \in$ Parameter, $k \in \mathcal{K}(\eta)$, and
$r \in$ Coins, if $m \in$ Plaintext then $\mathcal{D}_k(\mathcal{E}_k(m, r)) = m$, while if $m \notin$ Plaintext then
$\mathcal{D}_k(\mathcal{E}_k(m, r)) = 0$. For example, the encryption function could treat an
out-of-domain message as though it were 0. We insist that $|\mathcal{E}_k(x)|$ depends only on $\eta$
and $|x|$ when $k \in \mathcal{K}(\eta)$.

The definition above is for probabilistic, stateless encryption. One can be a 
bit more general, allowing the encryption algorithm to maintain state. We do 
not pursue this generalization here. 
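
The interface above can be sketched in code. The following is a minimal illustration, not a vetted construction: the names `keygen`, `encrypt`, `decrypt`, the constants, and the SHA-256 based pad are all assumptions made for this sketch. It shows the three algorithms, the rule that out-of-domain messages decrypt to 0, and the requirement that the ciphertext length depends only on the plaintext length.

```python
import hashlib
import secrets
from typing import Optional

KEY_BYTES = 16    # hypothetical key length standing in for the parameter eta
COIN_BYTES = 16   # hypothetical number of coins consumed per encryption
MAX_LEN = 32      # Plaintext = all byte strings of length <= MAX_LEN (assumption)
ZERO = b"\x00"    # the distinguished plaintext written 0 in the text


def keygen(eta: int = KEY_BYTES) -> bytes:
    """K(eta): sample a key using fresh random coins."""
    return secrets.token_bytes(eta)


def _pad(key: bytes, coins: bytes, n: int) -> bytes:
    """Derive n pad bytes from (key, coins) by hashing with a counter."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + coins + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def encrypt(key: bytes, m: bytes, coins: Optional[bytes] = None) -> bytes:
    """E_k(m, r): an out-of-domain message is treated as the plaintext 0."""
    if len(m) > MAX_LEN:
        m = ZERO
    if coins is None:
        coins = secrets.token_bytes(COIN_BYTES)
    masked = bytes(a ^ b for a, b in zip(m, _pad(key, coins, len(m))))
    return coins + masked


def decrypt(key: bytes, c: bytes) -> bytes:
    """D_k(c): split off the coins and remove the pad."""
    coins, body = c[:COIN_BYTES], c[COIN_BYTES:]
    return bytes(a ^ b for a, b in zip(body, _pad(key, coins, len(body))))


k = keygen()
recovered = decrypt(k, encrypt(k, b"hello"))      # in-domain round trip
fallback = decrypt(k, encrypt(k, b"x" * 100))     # out-of-domain message
```

Because `encrypt` draws fresh coins on every call, two encryptions of the same message differ except with negligible probability, which is the probabilistic, stateless behaviour the definition describes.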
Other basic concepts. A function $\epsilon : \mathbb{N} \to \mathbb{R}$ is negligible if
$\epsilon(\eta) \in \eta^{-\omega(1)}$. This
means that for all $c > 0$ there exists $N_c$ such that $\epsilon(\eta) \le \eta^{-c}$ for all $\eta \ge N_c$.
An ensemble (or probability ensemble) is a collection of distributions on strings,
$D = \{D_\eta\}$, one for each $\eta$. We write $x \stackrel{R}{\leftarrow} D_\eta$ to indicate that $x$ is sampled
from $D_\eta$. Let $D = \{D_\eta\}$ and $D' = \{D'_\eta\}$ be ensembles. We say that $D$ and $D'$
are indistinguishable (or computationally indistinguishable), and write $D \approx D'$,
if for every probabilistic polynomial-time adversary $A$, the function

$$\epsilon(\eta) = \Pr[x \stackrel{R}{\leftarrow} D_\eta : A(\eta, x) = 1] - \Pr[x \stackrel{R}{\leftarrow} D'_\eta : A(\eta, x) = 1]$$

is negligible.
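
For a single fixed value of the security parameter, the difference of probabilities in this definition can be computed exactly when the distributions are small. The sketch below is only an illustration of the formula: the two distributions and the adversary are hypothetical, and negligibility itself is an asymptotic property that no single computation can establish.

```python
from fractions import Fraction
from typing import Callable, Dict

Distribution = Dict[str, Fraction]   # maps each string to its probability


def accept_probability(dist: Distribution, adversary: Callable[[str], int]) -> Fraction:
    """Pr[x <- dist : adversary(x) = 1], computed exactly."""
    return sum((p for x, p in dist.items() if adversary(x) == 1), Fraction(0))


def advantage(d0: Distribution, d1: Distribution,
              adversary: Callable[[str], int]) -> Fraction:
    """The signed difference of acceptance probabilities used in the definition."""
    return accept_probability(d0, adversary) - accept_probability(d1, adversary)


# Two hypothetical distributions on 2-bit strings.
uniform = {format(i, "02b"): Fraction(1, 4) for i in range(4)}
biased = {"00": Fraction(1, 2), "01": Fraction(1, 4), "10": Fraction(1, 4)}


def first_bit_is_zero(x: str) -> int:
    """A simple distinguisher: output 1 when the string starts with 0."""
    return 1 if x[0] == "0" else 0


eps = advantage(biased, uniform, first_bit_is_zero)   # 3/4 - 1/2
```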

4.2 Aspects of Encryption-Scheme Security

In this section we consider some possible attributes of encryption schemes, and 
also consider encryption cycles. These issues already appear in Section 3 in a
formal setting; here we explore them further in a computational setting. 

Attributes (present or absent) of a secure encryption scheme. We single out three 
characteristics of an encryption scheme. The first and third are well-known, while 
the second seems not to have received attention till now. 

— Repetition concealing vs. repetition revealing: 

Given ciphertexts $c$ and $c'$, can one tell if their underlying plaintexts are
equal? If so, we call the scheme repetition revealing; otherwise, it is repe- 
tition concealing. A repetition-concealing scheme must be probabilistic (or 
stateful); making encryption schemes repetition concealing is one motivation 
for probabilistic encryption m 

— Which-key concealing vs. which-key revealing: 

If one encrypts messages under various keys, can one tell which messages 
were encrypted under the same keys? If so, we call the scheme which-key 
revealing; otherwise, it is which-key concealing. Though standard instantia- 
tions of encryption schemes are which-key concealing, standard definitions 
for encryption-scheme security (like those in 1 1 !SI/j 1 do not guarantee this. 
Demanding that an encryption scheme be which-key concealing is useful in 
contexts beyond that of the present paper (for example, in achieving forms 
of anonymity). The current work of Bellare et al. undertakes a thorough 
treatment of which-key concealing encryption |S|. 

— Message-length concealing vs. message-length revealing: 

Does a ciphertext reveal the length of its underlying plaintext? If so, we call 
the scheme message-length revealing; otherwise, it is message-length concea- 
ling. Most encryption schemes are message-length revealing. The reason is 
that implementing message-length concealing encryption invariably entails 
padding messages to some maximal length, and it may therefore be quite 
inefficient. Message-length concealing encryption is possible when the mes- 
sage space is finite, or when all ciphertexts are infinite streams (rather than 
finite strings as stated in our definitions). 






M. Abadi and P. Rogaway 



These three characteristics are orthogonal, and all eight combinations make 
sense. Let us call these eight notions of security type-0, type-1, . . ., type-7, with 
the numbering determined as follows: concealing corresponds to a 0 bit and re- 
vealing to a 1 bit, and we interpret the three characteristics above as a 3-bit 
binary number, the most significant bit being for repetition concealing or revea- 
ling, then which-key concealing or revealing, finally message-length concealing or 
revealing. With this terminology, the conventional concept of encryption-scheme 
security, ever since the work of Goldwasser and Micali PH], has been type-3 se- 
curity: a ciphertext may reveal the length of the message and which key is being 
used, but it should not reveal if two ciphertexts are encryptions of the same 
message. However, this concept of security is not the only reasonable one. 
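
The numbering scheme can be made concrete with a few lines of code. This is a small sketch; the function and argument names are hypothetical, and the only content taken from the text is the bit order (repetition, then which-key, then message-length, with revealing encoded as 1).

```python
def security_type(repetition_revealing: bool,
                  which_key_revealing: bool,
                  length_revealing: bool) -> int:
    """Pack the three attributes into the 3-bit type number described above."""
    return ((int(repetition_revealing) << 2)
            | (int(which_key_revealing) << 1)
            | int(length_revealing))


def describe(n: int) -> str:
    """Inverse direction: spell out what a type-n scheme conceals or reveals."""
    names = ["repetition", "which-key", "message-length"]
    bits = [(n >> 2) & 1, (n >> 1) & 1, n & 1]
    return ", ".join(f"{name} {'revealing' if bit else 'concealing'}"
                     for name, bit in zip(names, bits))


conventional = security_type(False, True, True)   # the customary notion
strongest = security_type(False, False, False)    # conceals everything
```

With this encoding, the conventional notion (repetition concealing, which-key revealing, message-length revealing) evaluates to 3, and the notion that conceals all three attributes evaluates to 0.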

Encryption cycles. Given a type-$n$ ($n \in \{0, \ldots, 7\}$) secure encryption scheme
$\Pi = (\mathcal{K}, \mathcal{E}, \mathcal{D})$, one can construct a type-$n$ secure encryption scheme
$\Pi' = (\mathcal{K}, \mathcal{E}', \mathcal{D}')$ with the following property: $\Pi'$ would be completely insecure if the
adversary were given (for example, as an additional input) even a single
encryption $c \stackrel{R}{\leftarrow} \mathcal{E}'_k(k)$ of the underlying key $k$. Goldwasser and Micali were aware of this
(in the public-key setting) when they published their work IlSI . 

It is not only encrypting k under k that is problematic; longer cycles may 
also cause problems. For example, even if an encryption scheme is type-3 secure, 
it may not be safe to encrypt a message b under a key a and then, reversing the 
roles of a and b, to encrypt a under b. For all we know, the concatenation of the 
two ciphertexts might trivially reveal both a and b. For probabilistic encryption, 
for cycles of length greater than one, we do not have any example to demonstrate 
that this problem can actually arise, but the hybrid arguments [li often used 
to prove encryption schemes secure, and which we use here, do not work in the 
presence of such cycles. 

Therefore, as discussed in Section 3, we focus on expressions without
encryption cycles. In return, we can rely on standard-looking definitions and tools in
the computational setting.



4.3 Definitions of Encryption-Scheme Security (Types 0, 1, 3)

The formal treatment in Section 3 corresponds to type-0 security (repetition
concealing, which-key concealing, and message-length concealing), so let us define
this notion more precisely. An explanation of the notation follows the definition. 

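Definition 1 (Type-0 security). Let $\Pi = (\mathcal{K}, \mathcal{E}, \mathcal{D})$ be an encryption scheme,
let $\eta \in$ Parameter be a security parameter, and let $A$ be an adversary. Define

$$\mathrm{Adv}^{0}_{\Pi[\eta]}(A) = \Pr\left[k, k' \stackrel{R}{\leftarrow} \mathcal{K}(\eta) : A^{\mathcal{E}_k(\cdot),\, \mathcal{E}_{k'}(\cdot)}(\eta) = 1\right] - \Pr\left[k \stackrel{R}{\leftarrow} \mathcal{K}(\eta) : A^{\mathcal{E}_k(0),\, \mathcal{E}_k(0)}(\eta) = 1\right]$$

Encryption scheme $\Pi$ is type-0 secure if for every probabilistic polynomial-time
adversary $A$, $\mathrm{Adv}^{0}_{\Pi[\eta]}(A)$ is negligible (as a function of $\eta$).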
We are looking at the difference of two probabilities.

— First, let us focus on the first probability. The quantity in brackets describes
an experiment that is performed, and then an event. In this experiment,
one first chooses two keys, $k$ and $k'$, independently, by running the
key-generation algorithm $\mathcal{K}$. Then one runs adversary $A$, with two oracles: a left
oracle $f$ and a right oracle $g$. If the adversary asks the left oracle $f$ a query
$m \in$ String, the oracle returns a random encryption of $m$ under key $k$. That
is, the oracle computes $c \stackrel{R}{\leftarrow} \mathcal{E}_k(m)$ and returns $c$. If the adversary asks the
right oracle $g$ a query $m \in$ String, the oracle returns a random encryption
of $m$ under key $k'$, similarly. Independent coins are used each time a string
is encrypted (but the keys $k$ and $k'$ stay fixed).

— Next, let us consider the second probability. In this experiment, a single
key $k$ is selected by running the key-generation algorithm $\mathcal{K}$. The adversary
again has two oracles, a left oracle $f$ and a right oracle $g$, and these oracles
again expect queries $m \in$ String. But now the oracles behave in the same
way. When asked a query $m$, the oracles ignore the query, sample $c \stackrel{R}{\leftarrow} \mathcal{E}_k(0)$,
and return $c$. Independent coins are used each time a string is encrypted
(but the key $k$ stays fixed).

The type-0 advantage is the difference in the above probabilities. One can imagine 
that the adversary is trying to distinguish a good encryption box from a false 
one. A good encryption box encrypts the specified query using the selected key. 
A false encryption box ignores the query and encrypts a fixed message under a 
fixed random key. Intuitively, a scheme is type-0 secure if no reasonable adversary 
can do a good job at telling apart the two encryption boxes on the basis of their 
input/output behavior.
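
The two experiments can be simulated directly. The sketch below is an illustration under stated assumptions: `det_encrypt` is a deliberately weak, deterministic toy scheme invented for this example, and `repetition_adversary` is a hypothetical adversary. Because the toy scheme uses no coins, the adversary tells the good encryption box from the false one every time, so the scheme is far from type-0 secure.

```python
import hashlib
import secrets
from typing import Callable

Oracle = Callable[[bytes], bytes]
Adversary = Callable[[Oracle, Oracle], int]
ZERO = b"\x00"


def keygen() -> bytes:
    return secrets.token_bytes(16)


def det_encrypt(key: bytes, m: bytes) -> bytes:
    """Toy deterministic scheme (no coins): XOR with a key-derived pad.
    Only the first 32 bytes of m are used; enough for this demonstration."""
    pad = hashlib.sha256(key).digest()
    return bytes(a ^ b for a, b in zip(m, pad))


def real_experiment(adversary: Adversary) -> int:
    """First probability: left oracle encrypts under k, right oracle under k'."""
    k, k_prime = keygen(), keygen()
    return adversary(lambda m: det_encrypt(k, m), lambda m: det_encrypt(k_prime, m))


def fake_experiment(adversary: Adversary) -> int:
    """Second probability: both oracles ignore the query and encrypt 0 under k."""
    k = keygen()
    box = lambda m: det_encrypt(k, ZERO)
    return adversary(box, box)


def repetition_adversary(f: Oracle, g: Oracle) -> int:
    """Ask the left oracle for two different plaintexts; output 1 if answers differ."""
    return 1 if f(b"\x00") != f(b"\x01") else 0


def estimate_advantage(adversary: Adversary, trials: int = 200) -> float:
    real = sum(real_experiment(adversary) for _ in range(trials)) / trials
    fake = sum(fake_experiment(adversary) for _ in range(trials)) / trials
    return real - fake


adv = estimate_advantage(repetition_adversary)
```

In the real experiment the two answers always differ in their first byte, and in the fake experiment they are always identical, so the estimated advantage is exactly 1 for this toy scheme.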

Various other equivalent formalizations for type-0 encryption are possible. 
For example, it adds no power for there to be more than two oracles. (In the 
first experiment, each oracle would encrypt queries under its own key; in the 
second, every oracle would encrypt 0 under a common key.) Likewise, it takes 
away no power if $\mathcal{E}_{k'}(\cdot)$ is replaced with $\mathcal{E}_{k'}(0)$ in the first experiment.

We also give detailed definitions of type-1 and type-3 security; they
resemble that of type-0 security. In these definitions, $\mathcal{E}_k(\cdot)$ is an oracle that
returns $c \stackrel{R}{\leftarrow} \mathcal{E}_k(m)$ on input $m$, as above, and $\mathcal{E}_k(0^{|\cdot|})$ is an oracle that returns
$c \stackrel{R}{\leftarrow} \mathcal{E}_k(0^{|m|})$ on input $m$.



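Definition 2 (Type-1 security). Let $\Pi = (\mathcal{K}, \mathcal{E}, \mathcal{D})$ be an encryption scheme,
let $\eta \in$ Parameter be a security parameter, and let $A$ be an adversary. Define

$$\mathrm{Adv}^{1}_{\Pi[\eta]}(A) = \Pr\left[k, k' \stackrel{R}{\leftarrow} \mathcal{K}(\eta) : A^{\mathcal{E}_k(\cdot),\, \mathcal{E}_{k'}(\cdot)}(\eta) = 1\right] - \Pr\left[k \stackrel{R}{\leftarrow} \mathcal{K}(\eta) : A^{\mathcal{E}_k(0^{|\cdot|}),\, \mathcal{E}_k(0^{|\cdot|})}(\eta) = 1\right]$$

Encryption scheme $\Pi$ is type-1 secure if for every probabilistic polynomial-time
adversary $A$, $\mathrm{Adv}^{1}_{\Pi[\eta]}(A)$ is negligible (as a function of $\eta$).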
Definition 3 (Type-3 security). Let $\Pi = (\mathcal{K}, \mathcal{E}, \mathcal{D})$ be an encryption scheme,
let $\eta \in$ Parameter be a security parameter, and let $A$ be an adversary. Define

$$\mathrm{Adv}^{3}_{\Pi[\eta]}(A) = \Pr\left[k \stackrel{R}{\leftarrow} \mathcal{K}(\eta) : A^{\mathcal{E}_k(\cdot)}(\eta) = 1\right] - \Pr\left[k \stackrel{R}{\leftarrow} \mathcal{K}(\eta) : A^{\mathcal{E}_k(0^{|\cdot|})}(\eta) = 1\right]$$

Encryption scheme $\Pi$ is type-3 secure if for every probabilistic polynomial-time
adversary $A$, $\mathrm{Adv}^{3}_{\Pi[\eta]}(A)$ is negligible (as a function of $\eta$).



4.4 Achieving Type-0 and Type-1 Security With Standard Tools 

Since type-3 security is standard but type-0 and type-1 security are not, we show 
that type-0 and type-1 security can be achieved using standard assumptions and 
constructions. Although this fact is not necessary for our soundness theorem, it 
provides support for the hypotheses of the theorem. 



Block ciphers. Let $\beta \ge 1$ be a number (the blocksize) and let Block = $\{0,1\}^\beta$. Let
Key be a finite nonempty set. Then a block cipher is a function $E$ : Key $\times$ Block $\to$
Block such that, for every $k \in$ Key, we have that $E_k(\cdot) = E(k, \cdot)$ is a permutation.
Example block ciphers are DES and the emerging AES (Advanced Encryption
Standard).

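One measure of security for a block cipher is:

$$\mathrm{Adv}^{\mathrm{prp}}_{E}(A) = \Pr\left[k \stackrel{R}{\leftarrow} \mathrm{Key} : A^{E_k(\cdot)} = 1\right] - \Pr\left[\pi \stackrel{R}{\leftarrow} \mathrm{Perm}(\beta) : A^{\pi(\cdot)} = 1\right]$$

Here Perm($\beta$) denotes the set of all permutations on $\{0,1\}^\beta$. Informally, the
adversary $A$ is trying to distinguish the block cipher $E$, as it behaves on a random
key $k$, from a random permutation $\pi$. We think of $E$ as a good block cipher if
$\mathrm{Adv}^{\mathrm{prp}}_{E}(A)$ is small as long as $A$ is of reasonable computational complexity.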
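
For a tiny block size, this measure can be evaluated exactly by enumerating all keys and all permutations. The sketch below is only an illustration: the block size of 2 bits, the weak cipher `xor_cipher`, and the adversary are hypothetical choices made so that the enumeration stays small.

```python
from fractions import Fraction
from itertools import permutations
from typing import Callable

BETA = 2                      # toy block size in bits (assumption)
BLOCKS = range(2 ** BETA)     # blocks are the integers 0..3


def xor_cipher(k: int, x: int) -> int:
    """E_k(x) = k xor x: a permutation for every key, but a very weak cipher."""
    return k ^ x


def adversary(oracle: Callable[[int], int]) -> int:
    """Query blocks 0 and 1; output 1 if the answers differ by exactly 0 xor 1."""
    return 1 if oracle(0) ^ oracle(1) == 1 else 0


def prp_advantage() -> Fraction:
    keys = list(BLOCKS)       # Key = Block with the uniform distribution
    real = Fraction(sum(adversary(lambda x, k=k: xor_cipher(k, x)) for k in keys),
                    len(keys))
    perms = list(permutations(BLOCKS))
    ideal = Fraction(sum(adversary(lambda x, p=p: p[x]) for p in perms),
                     len(perms))
    return real - ideal


result = prp_advantage()
```

The adversary always outputs 1 against `xor_cipher`, while only 8 of the 24 permutations of four blocks satisfy its test, so the advantage is 1 - 1/3 = 2/3 and this cipher is not good in the sense above.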



Block cipher modes of operation. Block ciphers are the most common building 
block for making symmetric encryption schemes. Two well-known ways to do 
this are CBC mode and CTR mode. In CBC mode (with a random initialization 
vector), the encryption of a plaintext $x = x_1 \ldots x_n$ using key $k \in$ Key, where
$n \ge 1$ and $|x_i| = \beta$, is $y_0 y_1 \cdots y_n$ where $y_0 \stackrel{R}{\leftarrow}$ Block and $y_i = E_k(y_{i-1} \oplus x_i)$
for all $1 \le i \le n$. In CTR mode, the encryption of a plaintext $x$ using key $k$
is the concatenation of $r \stackrel{R}{\leftarrow}$ Block with the xor of $x$ and the $|x|$-bit prefix of the
concatenation of $E_k(r), E_k(r+1), E_k(r+2), \ldots$. Here $r+i$ is the $\beta$-bit string that
encodes the sum of $r$ (treated as an unsigned number) and $i$, modulo $2^\beta$. In 0,
Bellare et al. establish the (type-3) security of these two modes of operation. 
Their results are quantitative, measuring how well one can attack the block 
cipher E in terms of how well one can attack the given encryption schemes 
based on E (in the sense of type-3 security). 
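
Both modes can be sketched in a few lines. The code below is an illustration under stated assumptions, not a reference implementation: the block cipher `E` is a toy 4-round Feistel permutation built from SHA-256 purely so that the example is self-contained (it is not DES or AES and carries no security claim), the block size is 16 bytes, and messages are handled at byte rather than bit granularity.

```python
import hashlib
import secrets

BETA = 16   # block size in bytes (the text measures beta in bits)


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([i]) + half).digest()[: BETA // 2]


def E(key: bytes, block: bytes) -> bytes:
    """Toy Feistel permutation on BETA-byte blocks (stand-in for a block cipher)."""
    left, right = block[: BETA // 2], block[BETA // 2:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def E_inv(key: bytes, block: bytes) -> bytes:
    left, right = block[: BETA // 2], block[BETA // 2:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def cbc_encrypt(key: bytes, x: bytes) -> bytes:
    """y_0 is a random block; y_i = E_k(y_{i-1} xor x_i)."""
    assert len(x) > 0 and len(x) % BETA == 0
    y = [secrets.token_bytes(BETA)]
    for j in range(0, len(x), BETA):
        y.append(E(key, _xor(y[-1], x[j:j + BETA])))
    return b"".join(y)


def cbc_decrypt(key: bytes, c: bytes) -> bytes:
    blocks = [c[j:j + BETA] for j in range(0, len(c), BETA)]
    return b"".join(_xor(E_inv(key, blocks[i]), blocks[i - 1])
                    for i in range(1, len(blocks)))


def _keystream(key: bytes, r: bytes, n: int) -> bytes:
    """Prefix of E_k(r), E_k(r+1), E_k(r+2), ... with addition modulo 2^(8*BETA)."""
    start, out, i = int.from_bytes(r, "big"), b"", 0
    while len(out) < n:
        counter = (start + i) % (1 << (8 * BETA))
        out += E(key, counter.to_bytes(BETA, "big"))
        i += 1
    return out[:n]


def ctr_encrypt(key: bytes, x: bytes) -> bytes:
    r = secrets.token_bytes(BETA)
    return r + _xor(x, _keystream(key, r, len(x)))


def ctr_decrypt(key: bytes, c: bytes) -> bytes:
    r, body = c[:BETA], c[BETA:]
    return _xor(body, _keystream(key, r, len(body)))
```

In both modes the ciphertext is exactly one block longer than the plaintext, so these schemes reveal the message length, in line with the discussion of type-1 versus type-0 security.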
CBC and CTR modes are which-key concealing. Even though the results just
mentioned do not indicate that CBC mode or CTR mode are which-key
concealing, these schemes are in fact which-key concealing and those results can be
used to show it, as we now sketch. Let $\Pi = (\mathcal{K}, \mathcal{E}, \mathcal{D})$ be an encryption scheme,
let $A$ be an adversary, and define

$$\mathrm{Adv}^{\$}_{\Pi[\eta]}(A) = \Pr\left[k \stackrel{R}{\leftarrow} \mathcal{K}(\eta) : A^{\mathcal{E}_k(\cdot)}(\eta) = 1\right] - \Pr\left[k \stackrel{R}{\leftarrow} \mathcal{K}(\eta) : A^{\$^{|\mathcal{E}_k(\cdot)|}}(\eta) = 1\right]$$

By $\$^{|\mathcal{E}_k(\cdot)|}$ we denote an oracle which, on input $m$, computes $c \stackrel{R}{\leftarrow} \mathcal{E}_k(m)$ and
returns a random string of length $|c|$. (By an assumption stated above, $|c|$ depends
only on $\eta$ and $|m|$.) Informally, the adversary cannot tell if it is given a real
encryption oracle or an oracle that returns a random string (of the appropriate 
length) in response to every query. 

The proofs of security in (Z] actually establish that CBC mode and CTR 
mode are good schemes according to $\mathrm{Adv}^{\$}$, assuming that the underlying
block cipher $E$ is secure according to $\mathrm{Adv}^{\mathrm{prp}}$. To complete the picture, we claim
that any good scheme according to $\mathrm{Adv}^{\$}$ is also type-1 secure. (This claim is
not hard to prove, though we omit doing so here.) Therefore, CBC mode and 
CTR mode (as defined above) are type-1 secure: repetition concealing, which-key 
concealing, but message-length revealing. 



Hiding message lengths for type-0 security. Finally, we have to conceal message 
lengths. This step is standard, provided the message space is finite. Let
$\Pi = (\mathcal{K}, \mathcal{E}, \mathcal{D})$ be a type-1 secure encryption scheme with Plaintext = $\{0,1\}^*$. Let
Plaintext' $\subset$ String be a finite set, with a particular element $0'$. To make a type-0
secure encryption scheme we just encode all messages of Plaintext' into strings
of some fixed length, and then encrypt these using $\mathcal{E}$. That is, we choose any
convenient function encode($\cdot$) which (reversibly) takes strings in Plaintext' to a
subset of $\{0,1\}^\ell$, for some number $\ell$. The encryption scheme $\Pi' = (\mathcal{K}, \mathcal{E}', \mathcal{D}')$,
with message space Plaintext', is defined by letting $\mathcal{E}'_k(m) = \mathcal{E}_k(\mathrm{encode}(m))$ for
$m \in$ Plaintext', setting $\mathcal{E}'_k(m) = \mathcal{E}'_k(0')$ for $m \notin$ Plaintext', and defining $\mathcal{D}'$ in the
obvious way. Type-1 security of $\Pi$ immediately implies type-0 security of $\Pi'$.
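
One possible encode function, and the wrapper that turns a scheme into its length-hiding version, can be sketched as follows. Everything named here is an assumption for illustration: Plaintext' is taken to be the byte strings of length at most `MAX_LEN`, the empty string plays the role of 0', and the padding rule (a marker byte followed by zeros) is just one convenient reversible encoding.

```python
from typing import Callable, Tuple

MAX_LEN = 8            # Plaintext' = byte strings of length <= MAX_LEN (assumption)
ELL = MAX_LEN + 1      # every encoded message has exactly ELL bytes
ZERO_PRIME = b""       # hypothetical choice for the particular element 0'

Enc = Callable[[bytes, bytes], bytes]
Dec = Callable[[bytes, bytes], bytes]


def encode(m: bytes) -> bytes:
    """Reversibly pad m to ELL bytes: a 0x80 marker followed by zero bytes."""
    if len(m) > MAX_LEN:
        raise ValueError("message is not in Plaintext'")
    return m + b"\x80" + b"\x00" * (MAX_LEN - len(m))


def decode(s: bytes) -> bytes:
    """Undo encode: everything before the last 0x80 byte is the message."""
    if len(s) != ELL:
        raise ValueError("not a valid encoding")
    return s[: s.rindex(b"\x80")]


def hide_length(encrypt: Enc, decrypt: Dec) -> Tuple[Enc, Dec]:
    """Build (E', D') from (E, D) as in the construction above."""
    def encrypt_prime(key: bytes, m: bytes) -> bytes:
        if len(m) > MAX_LEN:          # m outside Plaintext' is treated as 0'
            m = ZERO_PRIME
        return encrypt(key, encode(m))

    def decrypt_prime(key: bytes, c: bytes) -> bytes:
        return decode(decrypt(key, c))

    return encrypt_prime, decrypt_prime
```

Since every message is encoded to the same length before encryption, and the underlying ciphertext length depends only on the plaintext length, all ciphertexts of the wrapped scheme have one common length.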



5 The Computational Soundness of Formal Equivalence 

In this section we relate the two views of cryptography. We proceed in two 
steps. First, we show how to associate an ensemble to an expression M, given 
an encryption scheme $\Pi$. Then we show that, under appropriate assumptions,
equivalent expressions give rise to indistinguishable ensembles. 

5.1 Associating an Ensemble to an Expression 

algorithm Initialize$_\eta$($M$)
    for $K \in$ Keys($M$) do $\tau(K) \stackrel{R}{\leftarrow} \mathcal{K}(\eta)$

algorithm Convert($M$)
    if $M = K$ where $K \in$ Keys then
        return $\langle \tau(K), \text{“key”} \rangle$
    if $M = b$ where $b \in$ Bool then
        return $\langle b, \text{“bool”} \rangle$
    if $M = (M_1, M_2)$ then
        return $\langle \mathrm{Convert}(M_1), \mathrm{Convert}(M_2), \text{“pair”} \rangle$
    if $M = \{M_1\}_K$ then
        $x \stackrel{R}{\leftarrow} \mathrm{Convert}(M_1)$
        $y \stackrel{R}{\leftarrow} \mathcal{E}_{\tau(K)}(x)$
        return $\langle y, \text{“ciphertext”} \rangle$

Fig. 1. How to map (probabilistically) an expression $M$ to a string Convert($M$),
given an encryption scheme $\Pi = (\mathcal{K}, \mathcal{E}, \mathcal{D})$ and a security parameter $\eta$.

Let $\Pi = (\mathcal{K}, \mathcal{E}, \mathcal{D})$ be an encryption scheme and let $\eta \in$ Parameter be a
security parameter. We associate to each formal expression $M \in$ Exp a distribution
on strings $[\![M]\!]_{\Pi[\eta]}$, and thereby an ensemble $[\![M]\!]_{\Pi}$. This association
constitutes a concrete semantics for expressions (in the style of programming-language
semantics or logic semantics); it works as follows:

— First, we map each key symbol $K$ that occurs in $M$ to a string of bits $\tau(K)$,
using the key generator $\mathcal{K}(\eta)$.

— We map the formal bits 0 and 1 to standard string representations for them.

— We obtain the image of a formal pair $(M, N)$ by concatenating the images
of the components $M$ and $N$.

— We obtain the image of a formal encryption $\{M\}_K$ by calculating $\mathcal{E}_{\tau(K)}(x)$,
where $x$ is the image of $M$.

— In all cases, we tag string representations with their types (that is, “key”, 
“bool”, “pair”, “ciphertext”) in order to avoid any ambiguities. 

This association is defined more precisely in Figure 1. In the figure, we
write Keys($M$) for the set of all key symbols that occur in $M$, and write
$\langle x_1, \ldots, x_k \rangle$ for an ordinary string encoding of $x_1, \ldots, x_k$. The auxiliary
initialization procedure Initialize$_\eta$($M$) maps every key symbol $K$ in Keys($M$) to a
unique key $\tau(K)$. The probability of a string in $[\![M]\!]_{\Pi[\eta]}$ is that induced by the
algorithm Convert($M$) of Figure 1.
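
The two procedures can also be written as a short program. This is a sketch under stated assumptions: the class names, the length-prefixed tuple encoding `encode_tuple`, and the way the key generator and encryption algorithm are passed in as parameters are all choices made for this illustration, not part of the construction above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set, Union


@dataclass(frozen=True)
class Key:
    name: str


@dataclass(frozen=True)
class Bit:
    value: int


@dataclass(frozen=True)
class Pair:
    left: "Exp"
    right: "Exp"


@dataclass(frozen=True)
class Enc:
    body: "Exp"
    key: Key


Exp = Union[Key, Bit, Pair, Enc]


def keys_of(m: Exp) -> Set[Key]:
    """Keys(M): every key symbol occurring in M, including encryption keys."""
    if isinstance(m, Key):
        return {m}
    if isinstance(m, Bit):
        return set()
    if isinstance(m, Pair):
        return keys_of(m.left) | keys_of(m.right)
    return keys_of(m.body) | {m.key}


def encode_tuple(*parts: bytes) -> bytes:
    """An unambiguous string encoding <x_1, ..., x_k> via 4-byte length prefixes."""
    return b"".join(len(p).to_bytes(4, "big") + p for p in parts)


def initialize(m: Exp, keygen: Callable[[], bytes]) -> Dict[Key, bytes]:
    """Initialize: run the key generator once per key symbol to obtain tau."""
    return {k: keygen() for k in keys_of(m)}


def convert(m: Exp, tau: Dict[Key, bytes],
            encrypt: Callable[[bytes, bytes], bytes]) -> bytes:
    """Convert: map an expression to a tagged string, following Fig. 1."""
    if isinstance(m, Key):
        return encode_tuple(tau[m], b"key")
    if isinstance(m, Bit):
        return encode_tuple(bytes([m.value]), b"bool")
    if isinstance(m, Pair):
        return encode_tuple(convert(m.left, tau, encrypt),
                            convert(m.right, tau, encrypt), b"pair")
    x = convert(m.body, tau, encrypt)
    y = encrypt(tau[m.key], x)
    return encode_tuple(y, b"ciphertext")
```

Sampling from the distribution associated with an expression then amounts to calling `initialize` followed by `convert` with a probabilistic `encrypt`; both the key generation and the encryption coins contribute to the randomness.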

5.2 Equivalence Implies Indistinguishability 

Our theorem is that equivalent expressions correspond to indistinguishable en- 
sembles, assuming that the expressions are acyclic and that the underlying en- 
cryption scheme is type-0 secure. 

Theorem 1. Let $M$ and $N$ be acyclic expressions and let $\Pi$ be a type-0 secure
encryption scheme. Suppose that $M \cong N$. Then $[\![M]\!]_{\Pi} \approx [\![N]\!]_{\Pi}$.
The proof of this theorem is a hybrid argument, as in mm- One must be
particularly careful in forming the hybrids, relying on acyclicity. Because of the 
generality of the claim, the description of the hybrid argument is somewhat 
complex and long. Therefore, we omit the proof in the present version of this 
paper. We only give a few simple examples instantiating the claim that $M \cong N$
implies $[\![M]\!]_{\Pi} \approx [\![N]\!]_{\Pi}$:

— Since $0 \cong 0$, we conclude that $[\![0]\!]_{\Pi} \approx [\![0]\!]_{\Pi}$. The two ensembles being
compared put all the probability mass on a single point, $\langle 0, \text{“bool”} \rangle$.

— Since $K \cong K'$, we conclude that $[\![K]\!]_{\Pi} \approx [\![K']\!]_{\Pi}$. The two ensembles being
compared are identical: they are induced by the key generator $\mathcal{K}$.

— Since $\{0\}_K \cong \{1\}_K$, we conclude that $[\![\{0\}_K]\!]_{\Pi} \approx [\![\{1\}_K]\!]_{\Pi}$. This
indistinguishability is nontrivial: it relies on the assumption that the encryption
scheme is type-0 secure.

— Although $\{0\}_K \cong \{K\}_K$, we cannot conclude anything about how $[\![\{0\}_K]\!]_{\Pi}$
may relate to $[\![\{K\}_K]\!]_{\Pi}$, because of the encryption cycle in $\{K\}_K$.

Reconsidering some of the other examples of Section 3.3 can also be instructive.

One may wonder whether a converse to this theorem holds, that is, whether 
indistinguishability implies equivalence. This converse fails, for fairly trivial
reasons: if $\langle 0, \text{“bool”} \rangle, \langle 1, \text{“bool”} \rangle \notin$ Plaintext, then the same ensemble is
associated with the expressions $(K, \{0\}_K)$ and $(K, \{1\}_K)$, but these expressions are
not equivalent. We have not explored in detail whether the converse holds when 
Plaintext is large enough. 



6 Conclusions 

The formal approach to cryptography often deals with simple, all-or-nothing 
assertions about security. The computational approach, on the other hand, ma- 
kes a delicate use of probability and computational complexity. However, one 
may intuit that the formal assertions are valid in computational models, if not 
absolutely at least with high probability and against adversaries of limited com- 
putational power. In this paper, we develop this intuition, applying it to the 
study of encryption. We prove that the intuition is correct under substantial 
but reasonable hypotheses. This study of encryption is a step — perhaps modest 
but hopefully suggestive — toward treating security protocols and complete sy- 
stems, and toward combining the sophistication of computational models with 
the simplicity and power of formal reasoning. 
Molecular computing is a research area which aims to explore the potential of
computation by molecules. Although the idea dates back to Feynman, it became 
realistic only when Adleman succeeded in solving a Hamiltonian path problem 
using DNA. The research community for the investigation and development of 
DNA Based Computers was then quickly formed by groups in the United States, 
Europe, Japan, etc. 

The Japanese Molecular Computer Project, Theory and Construction of 
Molecular Computers, began in 1996 and will end in 2001. This project has 
produced a large number of experimental results, which verify the feasibility of 
basic operations for molecular computations and will help design a large scale 
molecular computer in the near future. In this, the last year of the project, a 
medium-scale DNA computer in which most reactions are executed by robot 
hands is under construction. A large number of theoretical results have also 
been produced by the project. Such theoretical studies either followed previous 
experimental results or initiated further experimental investigations. 

I strongly believe that molecular computing should aim to explore the com- 
putational power inherent in molecular reactions. It should not be restricted to 
solving combinatorial problems by means of massive parallelism. In particular, 
a deep understanding of the computational power of molecules can be applied 
to many areas of information technology, biotechnology, and nanotechnology. 

In this talk, after summarizing the achievements of our molecular computer
project, I will suggest a set of perspectives to place the theory of molecular 
computing. This set can be roughly classified as follows. 

— Studies which seek to propose or develop computational models suitable 
for describing molecular reactions. These include recent membrane models, 
which incorporate cell structure, in addition to simple molecular reactions. 

— Studies on the computability of such computational models. In particular, 
achieving universal computability has been the major research interest in 
these studies. These include already classic ideas to build Turing machines 
using DNA. 

— Studies on the complexity of such computational models. The trade-off bet- 
ween the number of computational steps and the amount of molecules neces- 
sary is a typical research issue. Note that molecular reactions should always 
be analyzed as parallel computations. 

In addition to the above main stream of research, there are a large number of 
studies for analyzing the overall fidelity and efficiency of molecular computations. 

J. van Leeuwen et al. (Eds.): IFIP TCS2000, LNCS 1872, pp. 23-^^ 2000. 
Related studies on encoding performance and design are also very active and
important for reducing error and enhancing efficiency. Although the design of 
good encodings can be formulated as a combinatorial problem, it has recently 
been recognized that a thermodynamical treatment is essential for analyzing the 
error mechanisms inherent in molecular reactions. 

According to statistical thermodynamics, the behavior of molecular reactions, 
and therefore that of molecular computations, is inherently probabilistic, even 
if the possibility of error is ignored. I believe that it is appropriate to analyze 
molecular computations as probabilistic processes, taking into account the phy- 
sical properties of molecular reactions. Studies on the complexity of molecular 
computations should therefore include probabilistic analyses based on statistical 
thermodynamics. Such physical analyses will deepen our understanding of mole- 
cular reactions and will be applied to many areas as fundamental knowledge. 
Abstract. Over the years coding theory and complexity theory have
benefited from a number of mutually enriching connections. This article 
focuses on a new connection that has emerged between the two topics 
in the recent years. This connection is centered around the notion of 
“list-decoding” for error-correcting codes. In this survey we describe the 
list-decoding problem, the algorithms that have been developed, and a 
diverse collection of applications within complexity theory. 



1 Introduction 

The areas of coding theory and complexity theory have had a long and sustained 
history of interesting connections. Early work on computation in the presence of 
noise built on these connections. Recent successes of complexity theory, showing 
IP=PSPACE and giving PCP characterizations of NP have relied on connections 
with coding theory either implicitly or explicitly. The survey article of Feigen- 
baum m gives a detailed account of many connections and consequences. 

Over the last few years a new strain of connections has emerged between 
coding theory and complexity theory. These connections are different from the 
previous ones in that they rely especially on the qualitative strength of the deco- 
ding algorithms; and in particular on the ability to recover from large amounts 
of noise. The first work in this vein seems to be that of Goldreich and Levin P2|, 
whose work describes (implicitly) an error-correcting code and gives a highly effi- 
cient algorithm to decode the code from even the slightest non-trivial amount of 
information. They use the algorithm to give a generic construction of hard-core 
predicates from an arbitrary one-way function. Subsequently, there have been a 
number of other such results providing even more powerful decoding algorithms 
and deriving other applications from these algorithms to complexity theory. 

The main theme common to these works is the application of a new notion for 
decoding of error-correcting codes called list decoding. List decoding formalizes 
the notion of error-correction, when the number of errors is potentially very large. 

* This survey is a fuller version of a previous one by the author that appeared in 
SIGACT NEWS, Volume 31, Number 1, pp. 16-27, March 2000. 

** Supported in part by a Sloan Foundation Fellowship and NSF Career Award CCR- 
9875511. 
Borrowing the terminology from the area of information communication, recall
that to transmit information over a noisy channel, the transmitter transmits a 
codeword of an error-correcting code. This transmitted word is corrupted by the 
noisy channel, and the receiver gets some corrupted word that we will call “the 
received word.” If the number of errors that occur during transmission is very 
large, then the received word may actually be closer to some codeword other 
than the transmitted one. Under the mandate of list-decoding, the receiver is 
required to compile a list of all codewords within a reasonable sized Hamming 
ball around the received word (and not just the nearest one). The list-decoding 
is declared to be successful if this list includes the transmitted word. 

This notion of list decoding was proposed by Elias jS| in the 1950’s.
However, till recently no non-trivial^1 list decoding algorithms were known for any
error-correcting code. Of late, we have seen a spurt of efficient list-decoding al- 
gorithms; and equally interestingly, a diverse collection of applications of these 
list-decoders to complexity theoretic problems. In this survey, we describe some 
of these results. First we start with some definitions. 

2 Error-Correcting Codes and List-Decoding 

A block error-correcting code $C$ is a collection of strings called codewords, all
of which have the same length, over some finite alphabet $\Sigma$. The three basic
parameters describing the code are the size of the alphabet, denoted $q$; the
length of the codewords $n$; and an information parameter $k$, where the number
of codewords is $q^k$. Such a code is succinctly referred to as an $(n, k)_q$ code^2. If $\Sigma$
has a field structure imposed on it, then $\Sigma^n$ may be viewed as a vector space.
If additionally $C$ forms a linear subspace of $\Sigma^n$, then $C$ is termed a linear code
and denoted an $[n, k]_q$ code^3. Almost all codes dealt with in this article will be
linear codes.

In order to ensure that the code helps in the recovery from errors, one designs
codes in which any two codewords differ from each other in a large number of
locations. Formally, let the Hamming distance between strings $x$ and $y$ from $\Sigma^n$,
denoted $\Delta(x, y)$, be the number of coordinates where $x$ and $y$ differ from each
other. Then the distance of a code $C$, typically denoted $d(C)$, is the minimum,
over all pairs of non-identical codewords in $C$, of the distance between the pair.

One of the first observations that can be made about a code $C$ with distance
$d$ is that it can unambiguously correct $\lfloor (d-1)/2 \rfloor$ errors, i.e., given any word $r \in \Sigma^n$,
there exists at most one codeword $c \in C$ such that $\Delta(r, c) \le (d-1)/2$. It is also easy to
find a word $r$ such that there exist two codewords within distance $\lceil d/2 \rceil$ of it, so one
cannot improve the error bound for unambiguous decoding. However it was realized

^1 Here triviality is used to rule out both brute-force search algorithms and the unique
decoding algorithms.

^2 Sometimes in the literature this would be referred to as an $(n, q^k)_q$ code, where the
second parameter counts the number of messages as opposed to say the “length” of
the message.

^3 Note the subtle change in the notation.
early on that unambiguous decoding is not the only useful notion of recovery
from error. Elias Pj proposed the notion of list decoding in which a decoding 
algorithm is expected to output a list of all codewords within a given distance e 
from a received word $r \in \Sigma^n$. If the list of words output is relatively small, then
one could consider this to be a reasonable recovery from error. Algorithmically, 
this problem is stated as follows: 

Definition 1 (List decoding problem for a code C). 

Input: Received word $r \in \Sigma^n$, error bound $e$.

Output: A list of all codewords $c_1, \ldots, c_m \in C$ that differ from $r$ in at most $e$
places.
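As a baseline for Definition 1, a brute-force list decoder simply scans the whole code. The sketch below assumes the code is small enough to enumerate explicitly; the four-word code and the function name are hypothetical. It is exactly the kind of trivial algorithm the efficient decoders discussed later are meant to beat.

```python
def list_decode_brute_force(code, r, e):
    """Return all codewords that differ from the received word r in at most e places."""
    return [c for c in code if sum(a != b for a, b in zip(c, r)) <= e]

# Hypothetical toy code of length 5 with minimum distance 3,
# so unambiguous decoding handles only 1 error.
toy_code = ["00000", "11100", "00111", "11011"]
received = "00100"

within_one = list_decode_brute_force(toy_code, received, 1)
within_two = list_decode_brute_force(toy_code, received, 2)
```

With error bound 1 the answer is unique, while with error bound 2 (beyond half the distance) the list contains several codewords, which is the situation list decoding is designed to handle.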

As usual, the goal is to solve the list decoding problem efficiently: i.e., in time 
polynomial in n. However this is only possible if the output size is polynomially 
bounded in n. This motivates the following, purely combinatorial, question. 

Definition 2 (List decoding problem: Combinatorial version). 

For every $c$, determine the function $e_c(n, k, d, q)$, the largest error bound $e$ such that for every $(n, k)_q$ code
$C$ of distance $d(C) = d$, and for every received word $r \in \Sigma^n$, there are at most
$(qn)^c$ codewords in the Hamming ball of radius $e$ around $r$.

We would like to study the asymptotic growth of $e_c$ when we, say, fix the ratios
$k/n$ and $d/n$ and let $n \to \infty$. If it makes sense, we would then like to study
$e_\infty$, the limit of $e_c$ as $c \to \infty$. It turns out that $e_\infty$ is fairly well-understood
and this will be described in Section 3. Somewhat coincidentally, for a variety
of codes, the list decoding problem can be solved in polynomial time provided
$e < (1 - o(1)) \cdot e_\infty$. These results will be described in Section 4.

Before concluding this section, we present one more version of the algorith- 
mic list-decoding problem that has been studied in the literature. This version 
is motivated by the question: Are there sub-linear time list-decoding algorithms 
for any error-correcting code? At first glance, linear time in n appears to be a 
lower bound on the running time since that is the amount of time it takes to 
read the input, or even the time to output one codeword. However, by specifying 
the input implicitly and allowing the output also to be specified implicitly, one is 
no longer subject to these trivial lower bounds on computation time. The notion 
of implicit representation of the input can be formalized by using an “oracle” to 
represent the input — when queried with an index i, the oracle responds with the 
ith bit of the received word. The notion of implicit representation of the output is 
somewhat more involved. Roughly we would like each element of the output list 
to be described succinctly by an efficient program that computes any one coor- 
dinate of the codeword. However these programs are allowed to be randomized; 
furthermore, they are allowed to make oracle calls to the implicit input when 
attempting to compute any one coordinate of the output. The notion of impli- 
cit representation of an output (codeword) is thus formalized by the concept of 
“probabilistic oracle machines,” machines that are allowed to make oracle calls 
(to the received word). Under this formalism, the list decoding problem may 
now be rephrased as:

Definition 3 (List decoding problem: Implicit version).

Implicit Input: Oracle access to received word $r : \{1, \ldots, n\} \to \Sigma$, error bound
$e$.

Output: A list of all codewords $c_1, \ldots, c_m \in C$, represented implicitly by
probabilistic oracle machines $M_1, \ldots, M_m$ working with oracle access to $r$, that differ
from $r$ in at most $e$ places. For every $i \in \{1, \ldots, m\}$ and $j \in \{1, \ldots, n\}$, $M_i$
satisfies the property that $\Pr[M_i^r(j) = c_i(j)]$ >

We remark that these implicit representations have now become common 
and useful in the theory of computation (e.g., in works on program testing/self- 
correcting, PCPs etc.). They allow for more modular use of algorithmic ideas; 
and results expressed in these terms deserve attention. It turns out that for 
the list decoding problem highly efficient solutions exist in this model for some
codes, essentially in time polynomial in $\log n$. This efficiency translates into
some very useful applications in complexity, and this will be described in the 
forthcoming sections. 

3 Status of the Combinatorial Problem 

We first sketch the status of the combinatorial problem described above. It is
easily seen that $e_\infty(n, k, d, q) \ge e_0(n, k, d, q) = \lfloor (d-1)/2 \rfloor$ (the unambiguous
error-correction radius). Also, if the parameters are such that an $(n, k)_q$ code with
distance $d$ does exist, then it is also possible to get one along with a received
word that has exponentially many (in $d$) codewords at distance $d$ from it. Informally,
this suggests $e_\infty(n, k, d, q) < d$ (though to be formal, we should first let $n$ go
to infinity, and then let $c$ go to infinity!). Thus it seems reasonable to believe
that $e_c$ may be of the form $\alpha d$, where $\alpha$ is some universal constant between
1/2 and 1, and possibly a function of $c$. Unfortunately, the answer is not so
simple; $e_c$ turns out to be a function also of $n$ and $q$ and surprisingly is not very
dependent on $c$. Roughly (if $q$ is very large), $e_c \approx n - \sqrt{n(n - d)}$. (Even
the task of performing a sanity check on this expression, i.e., to verify that
$d/2 \le n - \sqrt{n(n - d)} \le d$, takes a few moments!) Some insight into this expression:
If $d = o(n)$, then $n - \sqrt{n(n - d)}$ is well approximated by $d/2$. However, if $d$
is large, i.e., $d = n - o(n)$, then the bound on $e$ is also $n - o(n)$ and so $e$ is
well-approximated by $d$. We conclude that in the former case, the list-decoding
radius is limited by the “half the distance” barrier, while in the latter case, it is
not so limited.
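The sanity check mentioned above can be worked out as follows. This is a short derivation under the assumption $0 \le d \le n$, so that every quantity being squared is non-negative.

```latex
\begin{align*}
\frac{d}{2} \le n - \sqrt{n(n-d)}
  &\iff \sqrt{n(n-d)} \le n - \frac{d}{2}
   \iff n^2 - nd \le n^2 - nd + \frac{d^2}{4}, \\
n - \sqrt{n(n-d)} \le d
  &\iff n - d \le \sqrt{n(n-d)}
   \iff (n-d)^2 \le n(n-d).
\end{align*}
```

Both final inequalities hold trivially, the second because $n - d \le n$. The small-distance approximation follows from $\sqrt{n(n-d)} = n\sqrt{1 - d/n} \approx n\,(1 - d/(2n))$ when $d = o(n)$, which gives $n - \sqrt{n(n-d)} \approx d/2$.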

The following theorem essentially refines the above expression to take into
account small values of $q$. Recall that the “Plotkin bound” of coding theory shows
that error-correcting codes with $d > (1 - 1/q) \cdot n$ have only polynomially many
codewords and hence are not very interesting. Thus it makes sense to compare
$d$ and $e$ as fractions of $n'$ rather than $n$. The theorem statement replaces all
occurrences of $n$ in the expression above by $n' = (1 - 1/q) \cdot n$.

Theorem 4.

1. Let $n, k, d, q, e$ satisfy $d < n'$ and $e < \left(1 - \sqrt{1 - d/n'}\right) \cdot n'$, where
$n' = (1 - 1/q) \cdot n$. Then, for every $(n, k)_q$ code $C$ with $d(C) \ge d$ and for every
received word $r$, there are at most $q n^2$ codewords within a Hamming distance
of $e$ from $r$.

2. For every $n, d, q, e, \epsilon$ such that $\epsilon > 0$, $d < n'$ and
$e > (1 + \epsilon) \cdot \left(1 - \sqrt{1 - d/n'}\right) \cdot n'$, where $n' = (1 - 1/q) \cdot n$,
there exists a (non-linear) $(n, k)_q$ code $C$ of distance
at least $d$ and a received word $r$, such that there are exponentially many
codewords (with the exponent growing with $\epsilon n$) within a Hamming distance
of $e$ from $r$.



Note: The theorem above appears explicitly in [13]. The crucial direction, Part
(1) above, is a $q$-ary extension of the “Johnson bound” in coding theory. Johnson
proves this bound only for the binary case, but the extension to the $q$-ary case
seems to be implicitly known to the coding theory community [27, Chapter 4,
page 301].

Proof [Sketch]:

1. (Following a proof of Guruswami and Sudan m-) Fix the $q$-ary alphabet
$\Sigma$, the received word $r \in \Sigma^n$ and let $c_1, \ldots, c_m$ be codewords within a
Hamming distance of $e$ from $r$. Let $\bar{e}$ denote the average distance (averaged
over $i$) of $c_i$ from $r$. Note $\bar{e} \le e$. The main steps in the proof are: (1) Associate
with $\Sigma$, $q$ orthonormal vectors in $q$-dimensional real space. Without loss of
generality these may be the coordinate vectors. (2) Use this association to
embed the vectors $r$ and $c_1, \ldots, c_m$ in $\mathbb{R}^{qn}$. (3) Pick $i$ and $j$ in $\{1, \ldots, m\}$
at random (with replacement) and consider the expectation of the inner
product $\langle c_i - r, c_j - r \rangle$. Since $c_i$ and $c_j$ are close to $r$ and further are not very
close to each other, this inner product is small in expectation. Specifically
the expected value is at most $2\bar{e} - d + \frac{d}{m}$. On the other hand the vectors $c_i - r$
are not small (they are non-zero in an average of $2\bar{e}$ locations) and have
non-negative inner product in each coordinate. Some elementary manipulation
(which involves studying the location of the non-zero coordinates, their signs
and an application somewhere of the Cauchy-Schwarz inequality) shows that
the expected inner product is at least $\frac{q \bar{e}^2}{(q-1) n}$. This yields the inequality

$$\frac{q \bar{e}^2}{(q-1) n} \le 2\bar{e} - d + \frac{d}{m},$$

which, in turn, yields the bound in Part (1) of the Theorem.

2. (Following Goldreich et al. [13]) Let $r$ be the all zeroes vector. Pick $c_1, \ldots, c_m$
independently as follows: In each coordinate, $c_j$ is chosen randomly (and
independently of all else) to be 0 with probability 1 — ^ and chosen to be a 
random non-zero element of $\Sigma$ otherwise. The probability that there exists
a pair within distance $d$ is easily bounded from above by $m^2 \exp(-\epsilon n)$. Thus
it is possible to pick exponentially many codewords that are mutually at
distance at least $d$ while with high probability are within distance $e$ of the
received vector. ∎

Theorem 4 yields an asymptotically tight result on $e_\infty$. To study this
asymptotic limit, let us minimize some of the parameters above. First notice that
$k$ does not play any role in Part (1) of the theorem. So let $e_c(n, \cdot, d, q)$ denote
the minimum over $k$ of $e_c(n, k, d, q)$. Now further fix $d = \delta \cdot (1 - 1/q) \cdot n$ and let
$q = q(n)$ be any function of $n$. Now let

$$\varepsilon_c(\delta) = \lim_{n \to \infty} \frac{e_c(n, \cdot, d, q)}{(1 - 1/q) \cdot n}.$$

Let $\varepsilon_\infty(\delta) = \lim_{c \to \infty} \varepsilon_c(\delta)$. Then Theorem 4 above can be summarized as:

Corollary 5. For $\delta \in [0, 1]$, $\varepsilon_2(\delta) = \varepsilon_\infty(\delta) = 1 - \sqrt{1 - \delta}$.



In the next section, we will describe algorithmic results which come close to 
matching the combinatorial results above. 



4 Specific Codes and Performance of List-Decoding 
Algorithms 

We start by introducing the reader to a list of commonly used (and some not so 
commonly used) error-correcting codes. In the first five codes below, $q$ will be
assumed to be a prime power, and $\Sigma$ will be a finite field on $q$ elements.

Hadamard codes. For any $k$, the Hadamard code $H_k$ is an $(n = q^k, k)_q$ code
with distance $(1 - 1/q) q^k$, obtained as follows: The message is a $k$-dimensional
vector over $\Sigma$, denoted $\alpha$. The codeword is indexed by the space of $k$-dimensional
vectors. The $\beta$-th symbol in the encoding of $\alpha$ is their inner product, that is,
$\sum_{i=1}^{k} \alpha_i \beta_i$.
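A small sketch of the Hadamard encoding, assuming for simplicity that $q$ is a prime so that arithmetic modulo $q$ gives the field (the text allows any prime power). The function names are illustrative only.

```python
from itertools import product

def hadamard_encode(alpha, q):
    """Encode the message alpha (a tuple over Z_q, q prime) as the list of
    inner products <alpha, beta> mod q over all beta in Z_q^k."""
    k = len(alpha)
    return [sum(a * b for a, b in zip(alpha, beta)) % q
            for beta in product(range(q), repeat=k)]

def hadamard_min_distance(q, k):
    """Brute-force minimum distance of the Hadamard code (tiny q, k only)."""
    words = [hadamard_encode(alpha, q) for alpha in product(range(q), repeat=k)]
    return min(sum(x != y for x, y in zip(u, v))
               for i, u in enumerate(words) for v in words[i + 1:])

example_codeword = hadamard_encode((1, 2), 3)   # q = 3, k = 2, n = 9
```

For $q = 3$ and $k = 2$ the brute-force minimum distance comes out as $(1 - 1/3) \cdot 9 = 6$, matching the formula in the text.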

(Generalized) Reed Solomon codes. Here the message is thought of as specifying
a polynomial of degree at most $k - 1$ by giving its $k$ coefficients. The encoding
evaluates the polynomial at $n$ distinct points in the finite field. (It follows
that $q$ has to be at least $n$.) The fact that two distinct degree $k - 1$ polynomials
may agree on at most $k - 1$ points yields that the distance is at least $n - k + 1$.
The fact that there do exist distinct degree $k - 1$ polynomials that agree at any
given subset of $k - 1$ places shows that the distance is exactly $n - k + 1$.
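The encoding map and the distance claim can be checked on a toy instance. The sketch assumes a prime field $\mathbb{Z}_q$ and represents a message as a coefficient list with the constant term first; the parameters $q = 7$, $k = 3$, $n = 6$ are hypothetical.

```python
from itertools import product

def rs_encode(message, points, q):
    """Evaluate the polynomial whose coefficients (constant term first) are
    `message` at each of the given points, over Z_q with q prime."""
    def evaluate(x):
        acc = 0
        for coeff in reversed(message):   # Horner's rule
            acc = (acc * x + coeff) % q
        return acc
    return [evaluate(x) for x in points]

q, k = 7, 3
points = list(range(6))                   # n = 6 distinct field elements
codewords = [rs_encode(list(m), points, q) for m in product(range(q), repeat=k)]

# Brute-force minimum distance over all pairs of distinct codewords.
rs_distance = min(sum(a != b for a, b in zip(u, v))
                  for i, u in enumerate(codewords) for v in codewords[i + 1:])
```

On this instance the computed distance equals $n - k + 1 = 4$.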

Reed Muller codes. Reed Muller codes may be viewed as a common generalization
of Reed Solomon codes and Hadamard codes. For parameters $m$ and
$\ell < q$, the Reed Muller code has $k = \binom{m + \ell}{\ell}$ and $n = q^m$. The message is viewed
as specifying a polynomial of total degree at most $\ell$ over $m$ variables. The encoding
gives the evaluation of this polynomial at every possible input. For $\ell < q$,
the codewords are at a distance of at least $(1 - \ell/q) n$ from each other.

Algebraic geometric codes. These codes are also generalizations of the generalized
Reed Solomon codes. Description of the construction of these codes is
out of scope. All we will say is that they yield $(n, k)_q$ codes with distance at least
$d = n - k - n/(\sqrt{q} - 1)$ when $q$ is a square. In contrast a random linear code
of dimension $k$ has distance approximately $n - k - n/\log q$. Thus the algebraic
geometric codes asymptotically beat the distance achieved by the random code,
provided $q$ is large enough!

Concatenated codes. This term refers to any code obtained by a certain pro- 
cess, called concatenation of codes, that derives a new code from two given codes. 
Specifically, given an “outer” code over a $q^k$-ary alphabet and an “inner” code
of dimension $k$ over a $q$-ary alphabet, the concatenated codeword corresponding
to a given message is obtained by first encoding the message using the outer
code, and then encoding each symbol of the resulting string by the inner code.
If the outer code is an $(n_1, k_1)_{q^{k_2}}$ code and the inner code is an $(n_2, k_2)_q$ code,
then the concatenated code is an $(n_1 n_2, k_1 k_2)_q$ code. If $d_1$ is the distance of the
outer code and $d_2$ is the distance of the inner code, then the concatenated code
has minimum distance at least $d_1 d_2$. In this section we will consider codes obtained by
concatenating a Reed-Solomon, Reed-Muller or Algebraic-Geometry code as the 
outer code with a Hadamard code as the inner code. 

Chinese remainder codes. These codes are an aberration in the class of codes 
we consider in that they are not defined over any single alphabet. Rather the 
i-th symbol is from an alphabet of size $p_i$, where $p_1, \ldots, p_n$ are $n$ distinct primes
arranged in increasing order. The messages of this code are integers between
0 and $K - 1$, where $K = \prod_{i=1}^{k} p_i$. The encoding of a message is the $n$-tuple
of its residues modulo $p_1, \ldots, p_n$. In the coding theory literature, this code is
often referred to as the Redundant Residue Number System code. As an easy
consequence of the Chinese Remainder Theorem, we have that the message can
be inferred given any $k$ of the $n$ residues, making this a code of distance $n - k + 1$.
If $p_1 \approx p_n \approx p$, then one may view this code as approximately an $(n, k)_p$ code.
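The Chinese remainder code is easy to experiment with. The following sketch uses hypothetical toy parameters ($n = 5$ primes, $k = 2$) and illustrative function names; it checks that any $k$ residues recover the message. It relies on the three-argument `pow` with a negative exponent and on `math.prod`, both available from Python 3.8.

```python
from itertools import combinations
from math import prod

def crt_encode(message, primes):
    """Encode an integer as its residues modulo each prime."""
    return [message % p for p in primes]

def crt_decode(residues, moduli):
    """Incremental Chinese remaindering: the unique x in [0, prod(moduli))
    with x congruent to residues[i] modulo moduli[i] for every i."""
    x, modulus = 0, 1
    for r, p in zip(residues, moduli):
        t = ((r - x) * pow(modulus, -1, p)) % p   # solve x + modulus*t = r (mod p)
        x += modulus * t
        modulus *= p
    return x

primes = [2, 3, 5, 7, 11]      # hypothetical toy parameters: n = 5
k = 2
K = prod(primes[:k])           # messages are integers in [0, K)

message = 5
codeword = crt_encode(message, primes)
# Decode from every choice of k coordinates; each choice should give the message.
recovered = {
    crt_decode([codeword[i] for i in idx], [primes[i] for i in idx])
    for idx in combinations(range(len(primes)), k)
}
```

Because the primes are in increasing order, the product of any $k$ of them is at least $K$, which is why every $k$-subset of residues pins down the message.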



4.1 List Decoding Results: Explicit Version 

For some families of codes, it is possible to get algorithms that perform list- 
decoding in polynomial time for $e$, the number of errors, as large as the bound
in Theorem 4. The following theorem lists this family of results.

Theorem 6. Let $C$ be an $(n, k)_q$ code with designed distance^4 $d = \delta n'$, where
$n' = (1 - 1/q) n$. Further, if $C$ is either a (1) Hadamard code, (2) Reed-Solomon
code, (3) algebraic-geometric code, (4) Reed-Solomon or algebraic-geometry code
concatenated with a Hadamard code, or (5) Chinese remainder code, then it has
a polynomial time list decoding algorithm that decodes from $e < (1 - \sqrt{1 - \delta}) n'$
errors.

^4 In at least two cases, that of algebraic-geometry codes and algebraic-geometry codes
concatenated with Hadamard code, the code designed to have distance $d$ may turn
out to have larger minimum distance. Typical decoding algorithms are unable to
exploit this extra bonus, and work only against the designed distance of the code; in
fact, there may be no short proof of the fact that the code has this larger minimum
distance. This explains the term “designed distance” of a code.

Proofs of these results are out of scope. We will simply give some pointers
here.

Remarks: 

1. Note that the result for Hadamard codes is trivial, since this code has only
$n$ codewords. Thus a brute force search algorithm that lists all codewords
and then evaluates their distance against the received word to prune this
list, runs in time $O(n^2)$.

2. An algorithm for list-decoding the Reed-Solomon codes when
$e < n - \sqrt{2n(n - d)}$ was given by Sudan jTTj based on earlier work of Ar, Lipton,
Rubinfeld, and Sudan [2]. For the case of the explicit list-decoding problem this
was the first non-trivial list-decoder that was constructed for any code. The 
tight result above is from the work of Guruswami and Sudan ini- 

3. The first list-decoder for algebraic-geometry codes was given by Shokrollahi 
and Wasserman [2S1. Their error bound matched that of Sudan's Reed-Solomon decoder above. The tight
result above is again from inj. 

4. It is easy to combine non-trivial list-decoders for the outer code and inner 
code to get some non-trivial list decoding of a concatenated code. However, 
such results will not obtain the tight result described above. The tight result 
above is due to Guruswami and Sudan m- 

5. A list decoder correcting n − errors for the Chinese remainder
codes was given by Goldreich, Ron, and Sudan m-. Boneh 0 recently
improved this bound to correct n − errors. Even more recently,
Guruswami, Sahai, and Sudan uni improved this to correct $n - \sqrt{nk}$ errors.



4.2 List Decoding Results: Implicit Version 

For the implicit list-decoding problem, some fairly strong results are known for 
the cases of Hadamard codes, Reed-Muller codes and consequently for concate- 
nated codes. We describe these results in the next two theorems. For the case of 
binary Hadamard codes, Goldreich and Levin m gave a list-decoding algorithm 
when the received word is specified implicitly. They consider the case when the
number of errors is arbitrarily close to the limit of Theorem 4 and in particular
takes the form $e = (\frac{1}{2} - \gamma) n$. They give a randomized algorithm whose running
time is $\mathrm{poly}(\log n, 1/\gamma)$ and which reconstructs explicitly a list of all messages (note that
message lengths are $O(\log n)$ for the Hadamard code) that come within this
error of the received word. Subsequently, this algorithm was generalized to general
$q$-ary Hadamard codes by Goldreich, Rubinfeld, and Sudan [13]. This yields the
following theorem.

Theorem 7. There exists a probabilistic list decoding algorithm in the implicit
input model for Hadamard codes that behaves as follows: For an (n, k)q code C, 
given oracle access to a received word r, the algorithm outputs a list that includes 
all messages that lie within a distance of $e$ from the received word. The running
time of the algorithm is a polynomial in $\log n$, $\log q$ and $\frac{1}{\frac{q-1}{q} - \frac{e}{n}}$.



For the case of Reed-Muller codes, equally strong list-decoding results are known,
now with the output representation also being implicit. Arora and Sudan jSj
provided such a list-decoder provided the error bound satisfies
$e < n(1 - (1 - d/n)^{\epsilon})$, for some positive $\epsilon$. Sudan, Trevisan, and Vadhan [32] improved this
bound to a tighter bound of $e < (1 - \sqrt{1 - d/n}) n$, thus yielding the following
theorem.



Theorem 8. There exists a probabilistic list decoding algorithm in the implicit
input and implicit output model for Reed-Muller codes that behaves as follows:
For an $(n = q^m, k = \binom{m + \ell}{\ell})_q$ Reed-Muller code $C$, given oracle access to a received
word $r$, the algorithm outputs a list of randomized oracle programs that includes
one program for each codeword that lies within a distance of $e$ from the received
word, provided $e < (1 - O(\sqrt{\ell/q})) n$. The running time of the algorithm is a
polynomial in $m$, $\ell$ and $\log n$.

As pointed out earlier, it is easy to combine list-decoding algorithms for outer
and inner codes to get a list-decoding algorithm for a concatenated code. By
concatenating a Reed-Muller code with some appropriately chosen Hadamard
code, one also obtains the following result, which turns out to be a handy result
for many applications. In fact, all results of Section 5 use only the following
theorem.

Theorem 9. For every $q$, $\epsilon$ and $k$, if $n \ge \mathrm{poly}(k, q, 1/\epsilon)$ there exists an $(n, k)_q$
code with a polynomial time list-decoding algorithm for errors up to $(1 - 1/q - \epsilon) n$.
Furthermore, the algorithm runs in time polynomial in $\log k$ and $1/\epsilon$ if the input
and output are specified implicitly.

The above result, specialized to $q = 2$, is described explicitly in im-. The
general codes and list-decoding algorithm can be inferred from their proof.



5 Applications in Complexity Theory 

Algorithms for list-decoding have played a central role in a variety of results in 
complexity theory. Here we enumerate some (all?) of them. 

Hardcore predicates from one-way permutations. A classical question
lying at the very foundations of cryptography is the task of extracting one hard
Boolean function (predicate), given any hard one-way function. Specifically, given
a function $f : \{0,1\}^k \to \{0,1\}^k$ that is easy to compute but hard to invert,
obtain a predicate $P : \{0,1\}^k \to \{0,1\}$ such that $P(x)$ is hard to predict given
$f(x)$. Blum and Micali 0 showed how it was possible to extract one such hard
predicate from the Discrete Log function and used it to construct pseudo-random
generators. A natural question raised is whether this ability to extract hard
predicates is special to the Discrete Log function, and if not, could such a hard
predicate be extracted from every one-way function $f$.

At first glance this seems impossible. In fact, given any predicate $P$, it is possible
to construct one-way functions $f$ such that $f(x)$ immediately gives $P(x)$.
However, this limitation is inherited from the deterministic nature of $P$. Goldreich
and Levin modify the setting to allow the predicate $P$ to be randomized.
Specifically, they allow the predicate $P$ to be a function of $x$ and an
auxiliary random string $r$. $P$ is considered hardcore for $f$ if $P(x, r)$ is hard to
predict with accuracy better than $\frac{1}{2} + \epsilon$ given $f(x)$ and $r$. (The function $P$ is
said to be predictable with accuracy $\alpha$ if the output of some polynomial sized
circuit agrees with $P$ on an $\alpha$ fraction of the inputs.) They show that this minor
modification to the problem statement suffices to construct hardcore predicates from
any one-way function.

One parameter of some interest in the construction of hardcore predicates is
the length of the auxiliary random string. Let $l$ denote this parameter. The initial
construction of Goldreich and Levin (which was based on their list-decoding algorithm for the
Hadamard code) sets $l = k$. Impagliazzo m gives a substantial improvement
to this parameter, achieving $l = O(\log k + \log \frac{1}{\epsilon})$, by using the list-decoders for
Reed-Solomon and Hadamard codes. It turns out that both constructions can be
described as a special case of a generic construction using list-decodable codes.
The construction goes as follows: Let $C$ be an $(n, k)_2$ binary code as given by
Theorem 9 with $\epsilon$ set to some poly($\delta$). Then the predicate $P(x, r) = (C(x))_r$
(the $r$th bit of the encoding of $x$) is as hard as required. The proof follows
modularly from the list-decodability property of $C$. Specifically, if for some $x$,
the prediction of the circuit agrees with $P(x, \cdot)$ for a $\frac{1}{2} + \epsilon$ fraction of the values
of $r \in \{1, \ldots, n\}$, then one can use the list decoder to come up with a small list
of candidates that includes $x$. Further, the knowledge of $f(x)$ tells us how to find
which element of the list is $x$. The hardness of inverting $f$ thus yields that there
are not too many $x$'s for which the circuit can predict $P(x, \cdot)$ with this high an
accuracy. Now to see the effectiveness of Theorem 9 note that the extra input
has length $\log n$, which by the theorem is only $O(\log k + \log \frac{1}{\epsilon})$.
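The generic construction $P(x, r) = (C(x))_r$ can be illustrated with the binary Hadamard code, for which the predicate is the inner product of $x$ and $r$ modulo 2. The sketch below is only a toy: messages and indices are encoded as integer bitmasks, the "list decoder" is a brute-force scan over all $2^k$ messages (not the efficient algorithm of the text), and all names and the corrupted predictor table are hypothetical.

```python
def hadamard_bit(x, r):
    """Bit r of the binary Hadamard encoding of x: <x, r> mod 2 (bitmask inputs)."""
    return bin(x & r).count("1") % 2

def hadamard_encode(x, k):
    """Full binary Hadamard codeword of the k-bit message x (length 2**k)."""
    return [hadamard_bit(x, r) for r in range(2 ** k)]

def predicate(encode, x, r):
    """Generic predicate P(x, r) = (C(x))_r for any encoder returning a bit list."""
    return encode(x)[r]

def candidates(predictor_table, k, eps):
    """Brute-force stand-in for list decoding: all messages whose codeword
    agrees with the predictor on at least a (1/2 + eps) fraction of positions."""
    n = 2 ** k
    return [x for x in range(n)
            if sum(a == b for a, b in zip(hadamard_encode(x, k), predictor_table))
            >= (0.5 + eps) * n]

k = 3
secret = 0b101
# Hypothetical predictor: correct everywhere except position r = 1.
noisy_table = hadamard_encode(secret, k)
noisy_table[1] ^= 1
```

Every candidate list computed this way contains the secret; checking each candidate $x'$ against $f(x') = f(x)$ is the step that singles out the right one in the argument above.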

Aside: Recall that the early results of Blum and Micali and Alexi, Chor, 
Goldreich, and Schnorr Q that gave hardcore predicates for specific one-way 
functions (namely, Discrete Log and RSA) actually use $l = 0$ extra randomness.
It would be interesting to see if these specific results can also be explained in 
terms of list-decoding. 

Predicting witnesses for NP-search problems. Consider an NP-complete
relation such as 3-SAT. Kumar and Sivakumar P3, based on earlier work of 
Gal, Halevi, Lipton, and Petrank [m, raise the question of whether it is possible 
to efficiently construct a string x that has non-trivial proximity to a witness of 
the given instance. For general relations in NP they show that if some string 
with distance 5 + e can be found; then NP=P. They show this result using 
the list-decoding algorithms for Reed Solomon and Hadamard codes. Again this
result can be explained easily using Theorem 9 as follows: Construct an NP
relation whose instances are, say, instances of satisfiability but whose witnesses
are encodings, using a code obtained from Theorem 9, of satisfying assignments.
Given a string that has close proximity to a valid witness, one can recover a 
small set of strings one of which includes the witness. 

Amplifying hardness of Boolean functions. One of the recent success stories 
in complexity theory is in the area of finding complexity theoretic assumptions 
that suffice to derandomize BPP. In a central result in this direction, Impagliazzo
and Wigderson [22] show that a strong form of the assumption “E does not have
subexponential sized circuits” implies BPP=P. One important question raised
in this line of research is on amplification of the hardness of Boolean functions.
Specifically, given a Boolean function $f : \{0,1\}^l \to \{0,1\}$, transform it into a
Boolean function $f' : \{0,1\}^{O(l)} \to \{0,1\}$ such that if no small circuit computes
$f$, then no small circuit computes $f'$ on more than a $\frac{1}{2} + \epsilon$ fraction of the inputs.

[22] give such a transformation, which goes through a sequence of
transformations: one from Babai, Fortnow, Nisan and Wigderson, one from
Impagliazzo m, and a new one original to [22]. Again this step can be modularly
achieved from error-correcting codes efficiently list-decodable under the implicit
input/output model, as follows (from Sudan, Trevisan, and Vadhan [32]): Think
of $f$ as a $2^l$ bit string and encode this string using an error correcting code.
Say the encoded string is a $2^{l'}$ bit string. Then this function can be thought
of as the truth table of a function $f' : \{0,1\}^{l'} \to \{0,1\}$. It follows from the
list-decodability properties of the error-correcting code that /' is highly unpre- 
dictable. Specifically, suppose C is a circuit predicting /'. Then C is an implicit 
representation of a received word that is close to /'; thus list-decoding, in the 
implicit output model, yields a small circuit computing /' (and with some work, 
a small circuit encoding /). The strength of this transformation is again in its 
efficiency. For example, Impagliazzo, Shaltiel, and Wigderson note that this 
construction is also significantly more efficient in some parameters and use this 
aspect in other derandomizations of BPP. 

Direct product of NP-complete languages. Let SAT be the characteristic
function of the satisfiability language. I.e., SAT($\phi$) = 1 if $\phi$ is a satisfiable
formula and 0 otherwise. Let SAT$^l$ be the $l$-wise direct product of the SAT
function. I.e., it takes as input $l$ formulae $\phi_1, \ldots, \phi_l$ and outputs the $l$-bit vector
SAT($\phi_1$), ..., SAT($\phi_l$). Clearly SAT$^l$ is at least as hard to compute as SAT.
Presumably it is much harder. In fact if SAT were hard to compute on more
than a $1 - \delta$ fraction of the instances chosen from some distribution, then SAT$^l$
would be hard to compute with probability more than $(1 - \delta)^l$ on the product
distribution. Unfortunately, no NP-complete problem is known to be NP-hard 
when the inputs are chosen at random. In the face of this lack of knowledge, 
what can one say about SAT*? This topic is studied in the complexity theory 
literature under the label of membership comparability. Sivakumar m gives a 
nice hardness result for this problem. He shows that if it is even possible to efficiently
compute the least amount of information about SAT$^l$, for $l(n) = O(\log n)$, then
NP=RP. Specifically, if some polynomial time algorithm, on input an instance $\phi$
of SAT$^l$, rules out even one string out of $2^l$ as the value of SAT$^l(\phi)$, then it can be
used to decide satisfiability. Sivakumar m uses the list-decodability properties
of the Reed Solomon codes and a version of Sauer’s lemma. Simplifying the proof 
slightly, it is possible to get it as a direct consequence of Theorem 9 applied to
g = 2* and e = 2“^*, without use of Sauer’s Lemma (see m- 

Permanent of random matrices. In a striking result, Lipton showed 
how it is possible to use the fact that the permanent is a low-degree polynomial 
to conclude the following. If it is easy to compute the permanent of an n x n 
matrix modulo a prime p > n, with high probability when the matrix is chosen 
at random, then it is also easy to compute the permanent of any n x n matrix 
modulo p. One of the first results to establish the average-case hardness of a 
computationally hard problem, this result laid down the seed for a series of 
interesting results in complexity theory including IP=PSPACE and the PCP
characterizations of NP.

Subsequent results strengthened the average case hardness of the permanent 
to the point where it suffices to have an algorithm that computes the permanent 
modulo p on an inverse polynomially small fraction of the matrices, as shown by 
Cai, Pavan and Sivakumar |H|. Their result uses the list-decoding algorithm for 
Reed Solomon codes. Independently Goldreich, Ron and Sudan [El strengthened 
this result in a different direction. They show it suffices to have an algorithm that
computes the permanent correctly with inverse polynomial probability when 
both the matrix and the prime are chosen at random. Their result uses the list 
decoding algorithm for the Reed Solomon codes as well as that for the Chinese 
Remainder code. It turns out that the techniques of |B| extend to this problem 
also, thus giving an alternate proof that does not use the list-decoder for the 
Chinese remainder code (but still uses the list-decoder for the Reed-Solomon 
code). 

6 Sample of Algorithms 

Here we attempt to briefly sketch the algorithmic ideas needed to get, say,
Theorem 9 (and thus covering all applications of Section 5). To get this result, we
need the list-decoder for Reed-Solomon codes and Reed-Muller codes and some 
glue to patch the details. The reader is warned that this section is highly sketchy 
and many details are skimmed over without explicit notice. 

6.1 Decoding Reed-Solomon Codes 

Say we have an $(n, k)_q$ Reed-Solomon code obtained by taking degree $k - 1$
polynomials and evaluating them at points $x_1, \ldots, x_n \in \Sigma$, where $\Sigma$ is a field
on $q$ elements. Note that the list decoding problem here turns into the problem:
Given $n$ pairs $\{(x_1, r_1), \ldots, (x_n, r_n)\}$, find all degree $k - 1$ polynomials $p$ such
that $p(x_i) = r_i$ for at least $n - e$ values of $i$.

We describe the algorithm for the case when $e < n - k\sqrt{n}$. The algorithm
works in two steps. (1) Find a non-zero bivariate polynomial $Q(x, r)$ of degree
at most $\sqrt{n}$ in each of $x$ and $r$, such that $Q(x_i, r_i) = 0$ for every $i$. (2) Factor $Q$ into
irreducible factors. For every irreducible factor of the form $r - q(x)$, check if
$q(x_i) = r_i$ for at least $n - e$ values of $i$ and output it if so.

First note that both steps can be implemented efficiently. The first step amo- 
unts to solving a homogeneous linear system to find a non-trivial solution, and 
hence can be solved efficiently. The second step is implementable efficiently as 
a consequence of efficient factorization algorithms for multivariate polynomi- 
als given by Chistov and Grigoriev, Kaltofen, or Lenstra (see survey article by 
Kaltofen 1231 for pointers). 

To verify correctness, we first need to verify that step (1) will return some
polynomial $Q$. This is true since $Q$ has more than $n$ coefficients and thus the
homogeneous linear system has more variables than constraints, and hence has a
non-trivial solution. Now let $p$ be a degree $k - 1$ polynomial and $S \subseteq \{1, \ldots, n\}$
be a set of cardinality at least $n - e$ such that $p(x_i) = r_i$ for every $i \in S$. We
claim $(r - p(x)) \mid Q(x, r)$. We prove this using the “division algorithm over unique
factorization domains” which says this is the case iff $Q(x, p(x)) = 0$. To see this,
let $g(x) = Q(x, p(x))$ and note that for every $i \in S$,
$g(x_i) = Q(x_i, p(x_i)) = Q(x_i, r_i) = 0$. But $g$ is a polynomial of degree at most $k\sqrt{n}$ that is zero on more
than $k\sqrt{n}$ points. Hence $g$ is identically zero, as required. Thus we have that
$r - p(x)$ does divide $Q$ and hence $p$ will be included in the output list.
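The two steps can be tried out on a toy instance. The sketch below makes several assumptions that are not in the text: the field is $\mathbb{Z}_p$ for a small prime $p$; the degree bound is the integer $D = \lfloor \sqrt{n} \rfloor$; and step (2) is replaced by brute-force enumeration of all polynomials $f$ of degree below $k$, keeping those with $Q(x, f(x)) \equiv 0$, instead of bivariate factorization. All function names are illustrative. It needs Python 3.8 or later for `pow(a, -1, p)` and `math.isqrt`.

```python
from itertools import product
from math import isqrt

def evaluate(coeffs, x, p):
    """Evaluate a polynomial (constant term first) at x over Z_p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def nullspace_vector(rows, ncols, p):
    """Non-zero solution v of rows * v = 0 over Z_p (needs ncols > rank)."""
    rows = [row[:] for row in rows]
    pivots, rank = {}, 0                      # pivots: column -> row index
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                factor = rows[i][col]
                rows[i] = [(a - factor * b) % p for a, b in zip(rows[i], rows[rank])]
        pivots[col] = rank
        rank += 1
    free = next(c for c in range(ncols) if c not in pivots)
    v = [0] * ncols
    v[free] = 1
    for col, row in pivots.items():
        v[col] = (-rows[row][free]) % p
    return v

def sudan_list_decode(xs, rs, k, e, p):
    n, D = len(xs), isqrt(len(xs))
    monomials = [(a, b) for a in range(D + 1) for b in range(D + 1)]  # (D+1)^2 > n
    # Step (1): interpolate a non-zero Q with Q(x_i, r_i) = 0 for every i.
    rows = [[pow(x, a, p) * pow(r, b, p) % p for a, b in monomials]
            for x, r in zip(xs, rs)]
    coeffs = nullspace_vector(rows, len(monomials), p)

    def Q(x, y):
        return sum(c * pow(x, a, p) * pow(y, b, p)
                   for c, (a, b) in zip(coeffs, monomials)) % p

    # Step (2), brute-force stand-in for factoring: Q(x, f(x)) has degree <= k*D,
    # so vanishing at all p > k*D field points means it is identically zero.
    assert k * D < p
    found = []
    for f in product(range(p), repeat=k):
        if all(Q(x, evaluate(f, x, p)) == 0 for x in range(p)):
            if sum(evaluate(f, x, p) == r for x, r in zip(xs, rs)) >= n - e:
                found.append(list(f))
    return found

# Hypothetical instance over Z_17: the received word follows 1 + 2x on x = 0..8
# and 3 + 5x on x = 9..16; the two lines also meet at x = 5.
p = 17
xs = list(range(17))
rs = [evaluate([1, 2] if x <= 8 else [3, 5], x, p) for x in xs]
decoded = sudan_list_decode(xs, rs, k=2, e=8, p=p)
```

Here $n = 17$, $k = 2$ and $e = 8$, which satisfies $e < n - k\sqrt{n}$ but exceeds the unambiguous decoding radius of 7, and both polynomials agree with the received word in 9 places. The brute-force step makes the sketch pointless as an efficient decoder; it is only meant to show that every sufficiently close polynomial is a root of the interpolated $Q$.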

6.2 Implicit Decoding of Reed-Muller Codes 

We first describe a solution for the case when the error is relatively small (small
enough to guarantee unique solutions). Say we have access to the received word $r$
as an oracle mapping $\Sigma^m$ to $\Sigma$. Say $p$ is a polynomial of degree $l$ that agrees with
$r$ in $n - e$ places. We wish to design a (randomized) oracle program that computes
$p$. Suppose we wish to compute $p(i)$ for $i \in \Sigma^m$. Pick $j \in \Sigma^m$ at random and
consider the line $\ell = \{\ell(\theta) = (1 - \theta) i + \theta j \mid \theta \in \Sigma\}$. We note that $p$ restricted to this
line, i.e., the function $p|_\ell(\theta) = p(\ell(\theta))$, is a univariate polynomial in $\theta$ of degree at
most $l$. Further, the value we are interested in is $p|_\ell(0)$. The crucial observation here
is that for any fixed $\theta \ne 0$, $\ell(\theta)$ is a random point in $\Sigma^m$ and thus $r(\ell(\theta))$ is very
likely to equal $p|_\ell(\theta)$. Thus applying the univariate polynomial reconstruction
algorithm (i.e. the list decoding algorithm for Reed-Solomon codes) to the points
$\{(\theta, r(\ell(\theta))) \mid \theta \in \Sigma\}$ is very likely (over the random choice of $j$) to yield the
polynomial $p|_\ell$; evaluating this at 0 yields $p(i)$. Summarizing, the randomized
oracle program (the self-corrector) that implicitly describes $p$ works as follows: (1)
Picks $j \in \Sigma^m$ at random and sets $\ell = \{(1 - \theta) i + \theta j \mid \theta \in \Sigma\}$. (2) Uses the
univariate reconstruction algorithm to compute (explicitly) the polynomial $p|_\ell(\cdot)$.
(3) Outputs $p|_\ell(0)$.
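
The line-restriction step can be illustrated with a small, noise-free sketch. Everything concrete here is an assumption made for illustration and is not taken from the text: the field is the prime field with `Q = 101` elements, $m = 2$, the polynomial `p` is a made-up example of degree 3, and plain Lagrange interpolation stands in for the Reed-Solomon reconstruction algorithm (which is only adequate because this toy oracle contains no errors).

```python
import random

Q = 101  # size of the prime field; a hypothetical choice for this sketch


def p(point):
    """Hypothetical bivariate polynomial of total degree 3 over F_Q."""
    x, y = point
    return (3 * x * x * y + 5 * x + 7 * y * y + 2) % Q


def line_point(i, j, theta):
    """Return l(theta) = (1 - theta) * i + theta * j, coordinate-wise mod Q."""
    return tuple(((1 - theta) * a + theta * b) % Q for a, b in zip(i, j))


def interpolate_at_zero(samples):
    """Lagrange-interpolate the pairs (theta, value) and evaluate at theta = 0."""
    total = 0
    for t_a, v_a in samples:
        num, den = 1, 1
        for t_b, _ in samples:
            if t_b != t_a:
                num = num * (0 - t_b) % Q
                den = den * (t_a - t_b) % Q
        total = (total + v_a * num * pow(den, -1, Q)) % Q
    return total


def self_correct(oracle, i, degree, rng):
    """Compute p(i) from oracle values at points l(theta) with theta != 0."""
    j = tuple(rng.randrange(Q) for _ in i)          # random second point
    samples = [(t, oracle(line_point(i, j, t))) for t in range(1, degree + 2)]
    return interpolate_at_zero(samples)
```

With an erroneous oracle, step (2) would have to use an error-tolerant univariate reconstruction in place of `interpolate_at_zero`; the structure of the program stays the same.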

In attempting to extend this algorithm to higher error, the main problem
faced is the lack of information about $p$. Above, we used the fact that $p$ had
been implicitly specified by the oracle $r$ and the fact that it was a degree $l$
polynomial. But when $e$ is large, many polynomials may agree with $r$ on $n - e$
places, and the polynomial $p$ is not yet specified uniquely. In other words, if
$p_1, \ldots, p_t$ are all the polynomials that agree with $r$ in $n - e$ places, then it is easy
to extend the above algorithm into one that outputs a small set that includes
the values $\{p_1(i), \ldots, p_t(i)\}$. However it is hard to create the oracle for, say,
$p = p_1$. Among other things, it is not clear what distinguishes $p_1$ from any of the
other polynomials in the list. To create this distinction, we need some additional
information about $p_1$. Say we knew the value of $p_1$ at some point $j$, and let
$p_1(j) = \sigma$. Then we may modify the algorithm of the previous paragraph to get
a new one, call it $A_j$, as follows: (1’) Set $\ell = \{(1 - \theta) i + \theta j\}$. (2’) Use the univariate
reconstruction algorithm to compute a list of univariate polynomials $q_1, \ldots, q_t$
that agree with $r$ on approximately a $1 - e/n$ fraction of the points on the line $\ell$.
(3’) If there exists a polynomial $q_k$ in this list such that $q_k(1) = \sigma$, then output
$q_k(0)$, else do anything.

It can be shown that with high probability over the choice of $j$, $A_j$ correctly
computes $p$ for most choices of $i$. This part requires some analysis and we will
skip it (see [32]). Combining this with the self-corrector of the first paragraph,
we get that the self-corrector, run with $A_j$ as its oracle, is a probabilistic oracle
machine for $p$. If a running time of poly($q$) does not bother us, we may simply
guess $p(j)$ by running through all possible choices; else better methods can give
us a shorter list of candidates.

It may be easily verified that the running time of the decoder is polynomial
in $q$ and $m$ (while $n$ is $q^m$). For careful settings of $q$ and $m$, the running time
becomes polylogarithmic in $n$.

6.3 Decoding Concatenated Codes 

Finally, it is easy to guess a simple strategy for list-decoding concatenated codes,
given list-decoding algorithms for the outer and inner code. We describe a natural
algorithm, without giving any proofs; the proofs can be worked out as
an exercise. The decoding algorithm for the concatenation of an $(n_1, k_1)_{q^{k_2}}$ outer
code with an $(n_2, k_2)_q$ inner code may work as follows: Given a $q$-ary string of
length $n_1 n_2$, first list-decode the $n_1$ strings of length $n_2$ corresponding to the
inner code. In each case pick a random element of the list and thus create a
$q^{k_2}$-ary string of length $n_1$ corresponding to the outer code. List-decode this
string. Repeat if necessary.

Now to see why these ideas suffice to yield TheoremO, we take the code to be 
the concatenation of an outer Reed-Muller code and inner Hadamard code, with 
careful choice of code parameters. The list-decoder for the Reed-Muller code has 
already been described. For our purposes, the brute-force list-decoder for
Hadamard codes will suffice at the inner level. By choosing appropriate thresholds
for the list-decoding of the inner code, some relatively straightforward analysis 
yields Theorem El 

7 Concluding Thoughts 

By now we have seen many applications of algorithms for list-decoding. The
notion of list-decoding itself, never mind the algorithmic results, is a very
important one for complexity theory. The recent beautiful result of Trevisan
gives strong evidence of the role that this theme can play in central questions in
complexity/extremal combinatorics. (For those few of you who may have missed
this development, Trevisan showed how to construct a strong family of extractors
by combining binary codes that have very good combinatorial list-decoding
properties with a pseudo-random generator of Nisan and Wigderson. This
construction and its successors, see Raz, Reingold and Vadhan, and Impagliazzo,
Shaltiel, and Wigderson, reach optimal characteristics for various
choices of parameters.) We hope other applications of list-decoding will continue
to emerge as the notion becomes more popular.

We conclude with some open questions relating to the combinatorial list-decoding
performance of error-correcting codes. The combinatorial question relating to
list-decoding posed in this survey was chosen carefully to allow for a tight
presentation of results (in Theorem 0 and its corollary). However, it is much more
interesting to study list-decoding characteristics of specific codes and here we
know very little (Part (1) of Theorem 01 applies, of course, but Part (2) is
irrelevant to specific codes). For example, Ta-Shma and Zuckerman [33] have shown
that the random (non-linear) code has polynomially many codewords in any ball of
radius $e$, for $e$ very close to the minimum distance of the code. The existence
of such codes with good list-decoding properties raises the question of whether
such codes exist with small description size, and if so, whether they can be
constructed and/or decoded efficiently. One could ask such questions about the classes
of codes described in this paper. For example, what is the largest error $e$ for
an $(n, k)_q$ Reed-Solomon code for which the Hamming ball of radius $e$ around
any received word has only poly($n$) codewords? The best known bound is still
given by Part (1) of Theorem 0. In fact the following question remains very
interesting: Let $e^{\mathrm{lin}}_\infty(\delta)$ be defined analogously to $e_\infty(\delta)$, but restricted to linear
codes. As before we know $1 - \sqrt{1 - \delta} \le e^{\mathrm{lin}}_\infty(\delta) \le \delta$. However we know very little
beyond this point. (In some recent work in progress, Guruswami, Hastad, Sudan,
and Zuckerman have shown that the analogous quantity $e^{\mathrm{lin}}_c(\delta)$ is strictly
smaller than $\delta$ for every choice of $\delta$ and $c$; however, the difference in their proof
vanishes as $c \to \infty$.) Thus a number of questions relating to the combinatorics
of the list-decoding problem remain open. Depending on the answers to these, a
number of algorithmic challenges could also open up. Thus the area seems ripe
for further exploration.

Finally, a word of warning. This survey is very much a reflection of the
author’s current state of knowledge, or lack thereof. As the state of this knowledge
improves, the survey will hopefully get updates and if so the updated copy will 
be available from the author’s website http://theory.lcs.mit.edu/~madhu. 

Abstract. We present polynomial-time approximation algorithms for
string folding problems over any finite alphabet. Our idea is the following: 
describe a class of feasible solutions by means of an ambiguous context- 
free grammar (i.e. there is a bijection between the set of parse trees 
and a subset of possible embeddings of the string) ; give a score to every 
production of the grammar, so that the total score of every parse tree 
(the sum of the scores of the productions of the tree) equals the score 
of the corresponding structure; apply a parsing algorithm to find the 
parse tree with the highest score, corresponding to the configuration with 
highest score among those generated by the grammar. Furthermore, we 
show how the same approach can be extended in order to deal with an
infinite alphabet or different goal functions. In each case, we prove that
our algorithm guarantees a performance ratio that depends on the size 
of the alphabet or, in case of an infinite alphabet, on the length of the 
input string, both for the two and three-dimensional problem. Finally, 
we show some experimental results for the algorithm, comparing it to 
other performance-guaranteed approximation algorithms. 



1 Introduction 

We present performance-guaranteed approximation algorithms for different
versions of the string folding problem. The motivation of string folding problems
comes mainly from computational biology. One of the greatest challenges for 
computational biologists nowadays is to determine the three-dimensional native 
structure of a protein starting from the amino acid sequence that composes it. 
The problem has been studied from many different viewpoints, and many models 
have been proposed. Theoretical models are abstractions of the folding process 
that emphasize the effect of some factors while hiding other aspects. Perhaps the 
simplest and most studied model is the two-dimensional hydrophobic-hydrophilic 
(HP) model introduced by Dill. In this model, the amino acid residues are
grouped in two classes, according to their chemical properties: the hydrophobic,
i.e. non-polar, and the hydrophilic, i.e. polar. The protein instance can thus be
reduced to a binary sequence of H’s (meaning hydrophobic) and P’s (meaning
hydrophilic). Furthermore, to reduce the number of possible configurations, the
conformational space is discretized into a square lattice. Feasible structures are
therefore mappings (embeddings) of the string into the grid, where adjacent
symbols of the string lie on adjacent nodes, and no node is occupied by more
than one symbol. It has been observed experimentally that hydrophobic amino
acids tend to group inside the native structure, shielded from the environment
by the hydrophilic ones. Thus, an optimal configuration for the protein is one
that maximizes the number of H’s that are in contact on the lattice, that is, lie
on adjacent nodes of the lattice but are not adjacent in the input string. The
study of structures generated by this and other theoretical models can provide
useful insights into the dynamics of the folding process [5].

Fig. 1. Two-dimensional embedding of the string abcabcdcb. The score of the
embedding is 4.

In this paper, we also deal with string folding problems of a more general type. 
We are given as input a string over some alphabet. The score of an embedding 
of the string is the number of equal symbols of the string that are in contact on 
the grid. Figure 1 shows an example. Usually, a neutral symbol is included in
the alphabet. Contacts between neutral symbols do not contribute to the score 
of an embedding. For example, in the HP model P is the neutral symbol, and 
the score of the embeddings is determined only by the contacts between H’s. 
The problem is to find the embedding of the string with the maximum score. 
The three-dimensional version of the problem is defined in the same way; in this 
case, strings are mapped into the three-dimensional rectangular grid. 
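
The score just defined can be computed directly from an embedding. The sketch below is only illustrative: the function name `embedding_score`, the representation of an embedding as a list of integer coordinate tuples, and the `neutral` parameter are our own choices, not notation from the paper. It also checks the two feasibility conditions (consecutive symbols on adjacent nodes, no node used twice) and works for both the two- and three-dimensional grid.

```python
def embedding_score(string, coords, neutral=None):
    """Count contacts: equal non-neutral symbols lying on adjacent lattice
    nodes that are not adjacent in the string.

    coords[i] is the lattice node (a tuple of ints) of string[i].
    Raises ValueError if the embedding is not feasible.
    """
    if len(coords) != len(string):
        raise ValueError("one coordinate per symbol is required")
    if len(set(coords)) != len(coords):
        raise ValueError("two symbols occupy the same node")
    for a, b in zip(coords, coords[1:]):
        if sum(abs(x - y) for x, y in zip(a, b)) != 1:
            raise ValueError("consecutive symbols must lie on adjacent nodes")

    position = {node: idx for idx, node in enumerate(coords)}
    score = 0
    for idx, node in enumerate(coords):
        # Looking only at the +1 neighbour along each axis counts every
        # unordered pair of adjacent nodes exactly once.
        for axis in range(len(node)):
            neighbour = node[:axis] + (node[axis] + 1,) + node[axis + 1:]
            other = position.get(neighbour)
            if other is None or abs(other - idx) == 1:
                continue
            if string[idx] == string[other] and string[idx] != neutral:
                score += 1
    return score
```

For instance, folding `HPPH` around a unit square brings the two H's into contact, so `embedding_score("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)], neutral="P")` evaluates to 1.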

The string folding problem over any alphabet (finite or infinite) is NP-hard
both in the two- and three-dimensional case. Moreover, the three-dimensional
version has been proved to be MAX-SNP hard.

The algorithms we present are suitable for more specialized discrete models of 
the folding of biological sequences, where the goal function does not depend only 
on contacts between equal symbols, or where contacts between equal symbols 
have different weights. For example, they could be applied to the string folding 
problem over an alphabet of twenty symbols, representing the twenty amino 
acids that build proteins, or to the RNA folding problem over an alphabet of 
four symbols. Although approximation algorithms for the HP model have already
been proposed, to our knowledge these are the first performance-guaranteed
algorithms for the generalized problems. Moreover, the same approach could be
easily extended to other discrete or non-discrete models, where the goal function 
does not necessarily depend on the contacts between equal symbols, as long as 
the correspondence between parse trees and feasible structures is preserved. 

2 The Algorithm

We will now show the basic algorithm, which works on any finite alphabet. Let
$\Sigma$ be the alphabet, with $|\Sigma| = k$, and let $a_i$, $0 \le i \le k - 1$, be the symbols of the
alphabet. Let $a_0$ be the neutral symbol, included in the alphabet. Our algorithm
is based on the following steps:

1. Define an ambiguous context-free grammar that generates all the possible
instances of the problem (i.e., every string belonging to $\Sigma^*$).

2. Define a relation between the derivations of the grammar and a subset of all 
the possible embeddings, where every production of a derivation recursively 
corresponds to a layout on the lattice of the terminal symbols generated by 
the production itself. 

3. Assign to every production of the grammar an appropriate score, representing
(a lower bound to) the number of contacts between equal (but not
neutral) symbols generated by the spatial position of the symbols associated
with the production.

4. Given an instance of the problem, apply a parsing algorithm in order to find 
the parse tree with the highest score (computed as the sum of the scores of 
the productions of the tree), that is, the tree corresponding to the embedding 
of maximum score among those that can be generated by the grammar. 

Let us now introduce the grammar we employed in our algorithm. We defined
a context-free grammar $G = (T, N, R, P)$, where:

1. $T = \Sigma \cup \{u\}$ is the set of terminal symbols, where $u$ is a dummy terminal
symbol whose function will be explained later.

2. $N = \{S, L, R\}$ is the set of nonterminal symbols.

3. $R$ is the start symbol, i.e. the root of every parse tree.

4. $P$ is the set of the productions, composed of the following production
schemes:

(1) $S \to t_1\, S\, t_2$

(2) $S \to t_1\, L\, t_2\, S\, t_3\, L\, t_4$

(3) $S \to t_1\, L\, t_2\, S\, t_3\, t_4$

(4) $S \to t_1\, t_2\, S\, t_3\, L\, t_4$

(5) $S \to t_1\, t_2$

(6) $S \to t_1\, L\, t_2\, t_3\, L\, t_4$

(7) $S \to t_1\, t_2\, t_3\, L\, t_4$

(8) $S \to t_1\, L\, t_2\, t_3\, t_4$

(9) $L \to t_1\, L\, t_2$

(10) $L \to t_1\, t_2$

with $t_i \in \Sigma$,

and of the following productions, which do not involve symbols from $\Sigma$:

(11) $S \to S\, u\, u$

(12) $S \to u\, u$

(13) $R \to S\, S$

The layout of the terminal symbols associated with each production is shown
in Fig. 2. The proof that every parse tree corresponds to a feasible structure is
straightforward. The score of every production is increased by one every time
two equal non-neutral symbols generated by the production are in contact. For
example, the production $S \to a_i L a_i S a_i L a_i$ has score four, $S \to a_i L a_i S a_0 L a_0$
has score one, since neutral symbols do not contribute to the score, and so
on. Possible contacts between equal symbols generated by different productions
cannot be added to the score of the parse tree.

It could be argued that this grammar generates only sequences of even length.
To solve this problem, and to avoid adding further productions to the grammar,
in case of a sequence $s$ of odd length the string actually parsed is $s^* = s a_0$.
In fact, it can be proved that the best embedding among those that can be
generated by the algorithm for the original sequence $s$ is the structure found by
the algorithm for $s^*$, with the final neutral symbol removed.

The algorithm builds structures in which the sequence is folded onto itself
twice (see Fig. 3). The parse tree is split into two sub-trees, whose roots are the
two $S$ symbols generated by the start symbol $R$. The symbols generated by each
sub-tree form a structure shaped like a “U”, giving an overall configuration
similar to a “C”. If the length of the string is even, the first and last symbols are
always in contact. Terminals generated by $S$ nonterminals form the “backbone”
of the structure, while symbols generated by $L$ nonterminals form lateral
branches. The introduction of the dummy terminal symbol $u$ allows the grammar to
generate a larger set of structures. The string actually parsed (after the possible
addition of a neutral symbol) is $s\,u\,u$. If the second sub-tree contains only
the production $S \to u\,u$, the first sub-tree generates the whole sequence, which is
again folded once to form a structure shaped like a “U”. Without this extension
the algorithm would not be able to generate U-shaped structures, with a significant
decrease in its performance ratio (take for instance the string PHPPHP
in the HP model, whose optimal structure is U-shaped).
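
The two padding rules (a neutral symbol for odd lengths, then the two dummy terminals) amount to a small preprocessing step. The helper below is a sketch with hypothetical names (`string_to_parse`, `neutral`, `dummy`); it only mirrors the description above and is not taken from the authors' implementation.

```python
def string_to_parse(s, neutral, dummy="u"):
    """Return the list of symbols handed to the parser.

    An odd-length input is first extended with one neutral symbol, so that
    the grammar (which only generates even-length strings) can derive it;
    the two dummy terminals are then appended in every case.
    """
    symbols = list(s)
    if len(symbols) % 2 == 1:
        symbols.append(neutral)
    return symbols + [dummy, dummy]
```

For the string PHPPHP mentioned above, the parser would receive `P H P P H P u u`, so that the second sub-tree can consist of the production $S \to u\,u$ alone.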



3 The Parsing Algorithm 

The parsing algorithm is based on the Earley algorithm for context-free
grammars, and it is similar to the version that computes the Viterbi parse of a
string generated by a stochastic grammar proposed by Stolcke. It preserves
the worst case time ($O(n^3)$) and space ($O(n^2)$) complexity of the two algorithms.

The Earley parser keeps a set of states for each symbol in the input, describing 
all pending derivations. A state has the form: 

$i : {}_k X \to \lambda . \mu$

where $X$ is a nonterminal symbol of the grammar, $\lambda$ and $\mu$ are strings of terminals
or nonterminals, such that $X \to \lambda\mu$ is a production of the grammar, and $i$ and $k$
are indices into the input string. The index $i$ indicates that the state belongs to the
set associated with the $i$-th symbol of the input string. The index $k$ indicates that the
nonterminal $X$ has been expanded starting from the $k$-th symbol of the input,
and the right-hand side of the production has been expanded up to the position
indicated by the dot. A state with the dot at the end of the right-hand side is
called a complete state, since the dot indicates that the left-hand nonterminal
has been completely expanded.

Fig. 2. Production schemes and corresponding layout of the symbols and scores. $a_0$ is
the neutral symbol.

Fig. 3. Structure generated by the algorithm (score 11) for the sequence
$a_1 a_0 a_2 a_0 a_0 a_2 a_0 a_0 a_2 a_0 a_0 a_2 a_0 a_1\; a_1 a_0 a_2 a_0 a_0 a_2 a_0 a_0 a_2 a_0 a_0 a_2 a_0 a_1$ and corresponding
parse tree. Contacts between equal symbols are shown by dots (•). $\Sigma = \{a_0, a_1, a_2\}$,
where $a_0$ is the neutral symbol that does not contribute to the score of the embedding.

The algorithm is based on three steps that scan the input string from left to 
right and build new states starting from the current set of states and the current 
input symbol. The three steps, given as input a string $s = s_0 \ldots s_{n-1}$, work as
follows.



Prediction. For each state

$i : {}_k X \to \lambda . Y \mu$

where $Y$ is a nonterminal, and for all the productions $Y \to \nu$ of the grammar,
add the state:

$i : {}_i Y \to . \nu$

It can be seen that every prediction corresponds to a potential expansion of a
nonterminal in a left-most derivation. A state generated by this step is called a
predicted state.



Scanning. For each state

$i : {}_k X \to \lambda . a \mu$

where $a$ is a terminal symbol that matches the current input symbol $s_i$, add the
state:

$i + 1 : {}_k X \to \lambda a . \mu$

that is, move the dot one position to the right in the right-hand side. This
ensures that terminals generated by the productions match the input string.



Completion. For each complete state:

$i : {}_j Y \to \nu .$

and for each state in the set $j \le i$

$j : {}_k X \to \lambda . Y \mu$

with the nonterminal $Y$ after the dot, add the state:

$i : {}_k X \to \lambda Y . \mu$

A state generated by the completion step is called a completed state. A completed
state corresponds to the fact that one of the nonterminal symbols in the right-hand
side has been completely expanded (starting from a prediction step) and
has generated a sub-string of the input string. The algorithm performs the three
operations described above exhaustively, that is, until no new states can be
generated.

The algorithm starts from an initial dummy state, whose left-hand side is
empty:

$0 : {}_0 \to . R$

($R$ is the start symbol of the grammar). Then, the states corresponding to $R$
(and the possible states deriving from the productions with $R$ on the left-hand
side, and so on) are predicted, and the first scanning step examines the first
symbol of the input string.

After scanning the last symbol of the string, and performing the corresponding
completion step, the algorithm checks whether the state

$n : {}_0 \to R .$

is contained in the last set of states that has been produced, where $n$ is the length
of the input string. This means that the start symbol $R$ has been completely
expanded in order to build the input string, that is, the string belongs to the
language generated by the grammar. If during the computation any set of states
remains empty, the algorithm aborts, indicating that a prefix of the input string
that cannot be generated by the grammar has been detected.
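
The three steps and the final acceptance test can be sketched as a plain (unscored) Earley recognizer. This is an illustration under stated assumptions, not the authors' code: a state is stored as a tuple `(lhs, rhs, dot, origin)`, the dummy initial state has the empty string as its left-hand side, no production has an empty right-hand side, and `build_grammar` instantiates the production schemes (1)-(13) as we read them from the list in Section 2, with `t` standing for an arbitrary symbol of the alphabet. The early abort on an empty state set is omitted.

```python
from itertools import product

# Production schemes; "t" marks a slot for any symbol of the alphabet.
SCHEMES = {
    "S": ["tSt", "tLtStLt", "tLtStt", "ttStLt", "tt", "tLttLt", "tttLt", "tLttt"],
    "L": ["tLt", "tt"],
}


def build_grammar(alphabet, dummy="u"):
    """Instantiate every scheme with every choice of alphabet symbols."""
    grammar = {"R": [("S", "S")], "S": [("S", dummy, dummy), (dummy, dummy)], "L": []}
    for lhs, schemes in SCHEMES.items():
        for scheme in schemes:
            for choice in product(alphabet, repeat=scheme.count("t")):
                symbols = iter(choice)
                grammar[lhs].append(
                    tuple(next(symbols) if c == "t" else c for c in scheme))
    return grammar


def earley_recognize(grammar, start, tokens):
    """Return True if `tokens` can be derived from `start`.

    Any symbol that is not a key of `grammar` is treated as a terminal.
    """
    n = len(tokens)
    chart = [set() for _ in range(n + 1)]
    chart[0].add(("", (start,), 0, 0))          # dummy initial state
    for i in range(n + 1):
        agenda = list(chart[i])
        while agenda:
            lhs, rhs, dot, origin = agenda.pop()
            if dot < len(rhs):
                symbol = rhs[dot]
                if symbol in grammar:                        # prediction
                    new = [(symbol, alt, 0, i) for alt in grammar[symbol]]
                    target = i
                elif i < n and tokens[i] == symbol:          # scanning
                    new = [(lhs, rhs, dot + 1, origin)]
                    target = i + 1
                else:
                    continue
            else:                                            # completion
                new = [(l2, r2, d2 + 1, o2)
                       for (l2, r2, d2, o2) in chart[origin]
                       if d2 < len(r2) and r2[d2] == lhs]
                target = i
            for state in new:
                if state not in chart[target]:
                    chart[target].add(state)
                    if target == i:
                        agenda.append(state)
    return ("", (start,), 1, 0) in chart[n]
```

With `g = build_grammar("HP")`, the call `earley_recognize(g, "R", list("PHPPHP") + ["u", "u"])` accepts, while an odd-length input such as `list("HPH")` is rejected, in line with the remark that the grammar generates only strings of even length.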

The only difference between our grammar and a context-free grammar is the
introduction of a score associated with each production. We modified Earley’s
algorithm in order to compute the derivation that generates the input string
with the highest score. Basically, we added to each state a score, as follows:

$i : {}_k X \to \lambda . \mu$

Score = $p$

Intuitively, the idea is to have at the end of the parsing a state of the form:

$n : {}_0 \to R .$

Score = $h$

where $h$ is the score corresponding to the highest-score parse tree among those
that can be generated for the input string. In order to obtain this result, we
modified the three steps of the algorithm as follows.



Prediction. For each state

$i : {}_k X \to \lambda . Y \mu$

Score = $p$

where $Y$ is a nonterminal, and for all the productions $Y \to \nu$ of the grammar,
add the state:

$i : {}_i Y \to . \nu$

Score = 0

that is, all predicted states have their score set to zero.



Scanning. For each state

$i : {}_k X \to \lambda . a \mu$

Score = $p$

where $a$ is a terminal symbol that matches the current input symbol $s_i$, add the
state:

$i + 1 : {}_k X \to \lambda a . \mu$

Score = $p$

that is, scores are left unchanged by the scanning step.

Completion. The actual update of the scores takes place during the completion
step. The score of the complete states changes as follows. For each complete
state:

$i : {}_j Y \to \nu .$

Score = $p$

the score becomes:

Score = $p + q$

where $q$ is the score associated with the production $Y \to \nu$. That is, we add
to the score of the state (corresponding, as we will see, to the score of the
structures generated by the nonterminals contained in the right-hand side of
the production) the score corresponding to the layout of the terminal symbols
generated by the production itself. Then, for each subset of the current set
containing the states $s_1, \ldots, s_m$ of the form:

$i : {}_j Y \to \nu .$

Score = $q_l$

that is, a subset containing complete states with the same nonterminal on the
left-hand side that were predicted at the same $j$, and for each state:

$j : {}_k X \to \lambda . Y \mu$

Score = $p$

add the state:

$i : {}_k X \to \lambda Y . \mu$

Score = $p + q^*$

where $q^* = \max_{1 \le l \le m}\{q_l\}$. That is, we add to the score of the state the score
corresponding to the expansion of the nonterminal symbol $Y$ with the highest
score, i.e. the parse sub-tree with root $Y$ with the highest score. It can be proved
recursively that, at the end, the algorithm will associate the start symbol $R$ with
the score of the parse tree with the highest score. Moreover, the best parse tree
can be reconstructed by assigning to each expanded nonterminal symbol of a
completed state the corresponding complete state with the highest score.
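
The quantity computed by the scored parser, namely the maximum over all parse trees of the sum of the production scores, can also be obtained by a much simpler (if less efficient) memoized recursion over substrings. The sketch below deliberately swaps in that technique in place of the modified Earley parser, purely to make the "sum along the tree, maximum over alternatives" rule concrete. The toy grammar and its scores are hypothetical, and the code assumes that no production has an empty right-hand side and that there are no cycles of unit productions.

```python
from functools import lru_cache

NEG_INF = float("-inf")


def best_parse_score(grammar, scores, start, tokens):
    """Highest total production score over all parse trees of `tokens`.

    grammar: nonterminal -> list of right-hand sides (tuples of symbols)
    scores:  (nonterminal, rhs) -> score of that production (default 0)
    Returns -inf if `tokens` is not in the language.
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def best(symbol, i, j):
        """Best score of a derivation of tokens[i:j] from `symbol`."""
        if symbol not in grammar:                       # terminal symbol
            return 0 if j == i + 1 and tokens[i] == symbol else NEG_INF
        result = NEG_INF
        for rhs in grammar[symbol]:
            inner = best_seq(rhs, 0, i, j)
            if inner > NEG_INF:
                # score of the sub-trees plus the score of the production
                result = max(result, inner + scores.get((symbol, rhs), 0))
        return result

    @lru_cache(maxsize=None)
    def best_seq(rhs, pos, i, j):
        """Best total score of rhs[pos:] deriving tokens[i:j]."""
        rest = len(rhs) - pos
        if rest == 0:
            return 0 if i == j else NEG_INF
        if j - i < rest:            # every symbol needs at least one token
            return NEG_INF
        result = NEG_INF
        for k in range(i + 1, j - rest + 2):
            head = best(rhs[pos], i, k)
            if head > NEG_INF:
                tail = best_seq(rhs, pos + 1, k, j)
                if tail > NEG_INF:
                    result = max(result, head + tail)
        return result

    return best(start, 0, len(tokens))


# Hypothetical ambiguous toy grammar with made-up scores.
TOY_GRAMMAR = {"S": [("a", "S", "a"), ("a", "a"), ("S", "S")]}
TOY_SCORES = {("S", ("a", "S", "a")): 2, ("S", ("a", "a")): 1}
```

On the toy grammar the string `aaaa` has two parse trees, one nested (score 2 + 1) and one split into two halves (score 1 + 1), and `best_parse_score(TOY_GRAMMAR, TOY_SCORES, "S", "aaaa")` returns the larger value, 3.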

4 Dealing with an Infinite Alphabet 

In case of an infinite alphabet, the basic algorithm cannot be applied, since it
would be impossible to generate a priori all the productions of the grammar and
their corresponding scores. The solution we adopted is the following. We let the
parser work only on the production schemes, and we build the productions on
the fly while scanning the input string. The productions contained in the states
of the parser have the terminal symbols specified only on the left side of the
dot. That is, states generated during the prediction step of the parser contain,
instead of terminal symbols, a special wildcard symbol (*). Wildcard symbols
are replaced by terminals during the scanning step. For example, suppose we
are scanning a given symbol $s_i$ of the input string. Each state with a wildcard
symbol after the dot:

$i : {}_k X \to \lambda . * \mu$

generates a new state:

$i + 1 : {}_k X \to \lambda s_i . \mu$

where $\lambda$ is composed of terminal or nonterminal symbols of the grammar, and
$\mu$ is composed only of wildcard or nonterminal symbols (it does not contain
symbols belonging to $\Sigma$, since it still has to be expanded).

Also, the score of the productions cannot be computed in advance. As we have
seen, the score of a state is set to zero, until the state is completed or complete. For
complete states, the score of the corresponding production is computed according
to the rules shown in Fig. 2 and then added to the score of the state itself, as
in the finite case. When a state is completed, i.e. a nonterminal in its right-hand
side has been completely expanded, the score is updated by adding the
score of the maximum parse sub-tree generated, once again as in the finite case.
Moreover, the scoring scheme for the productions can be easily changed in order
to deal with different goal functions.



5 Performance Results 



In this section, we prove some results concerning the performance of the
algorithm when applied to the different versions of the problem.

In order to have a performance-guaranteed approximation algorithm, for
every possible instance of the problem the ratio between the score of the
structure generated by the algorithm and the score of the optimal structure must be
bounded by a constant. That is, for every possible sequence $s$ of arbitrary length
we must have:

$$\mathcal{R}(s) = \frac{A(s)}{OPT(s)} \ge \mathcal{R} \tag{1}$$

where $A(s)$ is the score of the structure generated by the algorithm when given
as input the sequence $s$, and $OPT(s)$ is the score of the optimal embedding. We
will call $\mathcal{R}$ the absolute performance ratio of the algorithm, and denote with $\mathcal{R}_k$
the absolute performance ratio when dealing with an alphabet of size $k$.



5.1 Binary Alphabet 

We will start from the performance ratio of the algorithm over a binary alphabet,
i.e., in the HP model. Given a string $s = s_0 \ldots s_n$, where $s_i \in \{H, P\}$, two symbols
$s_i$ and $s_j$ can be in contact on the grid only if $|j - i|$ is odd. Furthermore, every
symbol can be in contact with at most two other symbols, except when it is
located at one of the endpoints of the sequence. In this case, it can be in contact
with three other symbols.

Now, let $h_e$ be the number of H’s in even positions in a given sequence $s$; $h_o$
the number of H’s in odd positions; $h^* = \min\{h_e, h_o\}$. We also define $OPT(s)$
as the score (the number of contacts between H’s) of the optimal embedding for
a given sequence $s$. The above considerations yield the following:

Theorem 1.

$$OPT(s) \le 2h^* + 2 \tag{2}$$

It can be observed that the upper bound $2h^* + 2$ can be reached only by
sequences of odd length with two H’s at the endpoints. We can also give a lower
bound on the number of contacts that are generated by the algorithm.
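
The quantities in Theorem 1 are easy to compute for a concrete HP string. The helper below is a small sketch with a hypothetical name (`hp_upper_bound`); positions are counted from 0, which does not affect $h^*$ since it is the minimum of the two parity counts.

```python
def hp_upper_bound(s):
    """Return (h_star, bound) for an HP string s, where h_star is the
    minimum of the numbers of H's in even and in odd positions and
    bound = 2 * h_star + 2 is the upper bound on OPT(s) from Theorem 1."""
    h_even = sum(1 for i, c in enumerate(s) if c == "H" and i % 2 == 0)
    h_odd = sum(1 for i, c in enumerate(s) if c == "H" and i % 2 == 1)
    h_star = min(h_even, h_odd)
    return h_star, 2 * h_star + 2
```

For the string PHPPHP the two H's fall in one odd and one even position, so $h^* = 1$ and the bound on the number of contacts is 4.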

Lemma 1. Given a sequence $s$, there always exists an embedding for $s$,
corresponding to a parse tree generated by the algorithm, that contains $\lceil (h^* + 1)/2 \rceil$ contacts
between H’s.

The proof of this lemma is quite cumbersome. Actually, we have been able to
prove that, in the set of the structures that can be generated by the algorithm,
there always exists a structure with $\lceil (h^* + 1)/2 \rceil$ contacts, but not, for example, that
this is the best structure of the set. Thus, we could give only a lower bound on
the actual performance ratio of the algorithm. In fact, as shown in Section 6,
the worst case that we found experimentally gave a performance ratio of 3/8.
This will also affect, as we will see, the performance ratio for the more general
versions. From the result of Lemma 1, however, it is straightforward to obtain
the performance ratio of the algorithm.



Theorem 2.

$$\mathcal{R}_2 \ge \frac{\lceil (h^* + 1)/2 \rceil}{2h^* + 2} \ge \frac{1}{4} \tag{3}$$
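
The step from Lemma 1 and Theorem 1 to the constant $1/4$ is a one-line computation. It is spelled out here as a sketch, taking the lemma's bound to be $\lceil (h^*+1)/2 \rceil$ contacts and the theorem's bound to be $OPT(s) \le 2h^* + 2$:

```latex
\[
  \mathcal{R}_2 \;\ge\; \frac{\lceil (h^*+1)/2 \rceil}{2h^*+2}
  \;\ge\; \frac{(h^*+1)/2}{2\,(h^*+1)}
  \;=\; \frac{1}{4}.
\]
```

The middle inequality only drops the ceiling, so the bound is tight exactly when $h^* + 1$ is even and both estimates are attained.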



5.2 Finite Alphabet 

We will now show the results concerning the performance of the algorithm
applied to the problem over any finite alphabet. Let $s$ be a string of symbols taken
from an alphabet $\Sigma$, with $|\Sigma| = k$. Also, let $\sigma_i^o$ and $\sigma_i^e$, $1 \le i \le k - 1$, be the
number of occurrences of the symbol $a_i \in \Sigma$ in the sequence $s$ in an odd and
in an even position, respectively (we do not consider the number of occurrences
of the neutral symbol $a_0$). Finally, let $\sigma_i^* = \min\{\sigma_i^o, \sigma_i^e\}$, $1 \le i \le k - 1$, and
$\sigma^* = \max_{1 \le i < k}\{\sigma_i^*\}$.

Lemma 2. For any sequence $s$ over an alphabet $\Sigma$ with $|\Sigma| = k$,

$$OPT(s) \le 2 \sum_{i=1}^{k-1} \sigma_i^* + 2 \tag{4}$$

By considering the symbol that corresponds to $\sigma^*$ as the only non-neutral
symbol of the alphabet, we can easily prove the following lemma.

Lemma 3. For any sequence $s$ over a finite alphabet, the set of structures
generated by the algorithm contains a structure that generates at least $\lceil (\sigma^* + 1)/2 \rceil$
contacts.

We want to point out that once again the structure of Lemma 3 does not
correspond to the best structure that can be generated by the algorithm. In fact,
the algorithm tries to generate contacts not only with the symbol corresponding
to $\sigma^*$, but with all the non-neutral symbols, as shown in Fig. 3. Lemma 3
guarantees only three contacts between $a_2$ symbols, while the solution found by the
algorithm contains eleven contacts, and corresponds to the optimal embedding.
However, starting from the previous two lemmas, the worst case is a sequence
where the number of occurrences of every non-neutral symbol is equal.
Therefore, we have $OPT(s) \le 2\sigma^*(k - 1) + 2$. This fact yields the following theorem:

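Theorem 3.

$$\mathcal{R}_k \ge \frac{\lceil (\sigma^* + 1)/2 \rceil}{2 \sum_{i=1}^{k-1} \sigma_i^* + 2} \ge \frac{\sigma^* + 1}{4[\sigma^*(k - 1) + 1]} \ge \frac{1}{4(k - 1)} \tag{5}$$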
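
The quantities $\sigma_i^*$ and $\sigma^*$, and the two bounds they give, can be checked on the sequence of Fig. 3. The sketch below uses a hypothetical function name (`contact_bounds`) and encodes the symbols $a_0, a_1, a_2$ as the integers 0, 1, 2; it takes the guarantee of Lemma 3 to be $\lceil (\sigma^*+1)/2 \rceil$ and the bound of Lemma 2 to be $2\sum_i \sigma_i^* + 2$.

```python
from collections import Counter
from math import ceil


def contact_bounds(s, neutral):
    """Return (stars, guaranteed, upper) for a sequence s, where
    stars[sym] = min(#odd occurrences, #even occurrences) of each
                 non-neutral symbol,
    guaranteed = ceil((sigma_star + 1) / 2), the number of contacts
                 promised by Lemma 3,
    upper      = 2 * sum(stars) + 2, the bound on OPT(s) from Lemma 2."""
    odd, even = Counter(), Counter()
    for pos, sym in enumerate(s):
        if sym != neutral:
            (odd if pos % 2 else even)[sym] += 1
    stars = {sym: min(odd[sym], even[sym]) for sym in set(odd) | set(even)}
    sigma_star = max(stars.values(), default=0)
    guaranteed = ceil((sigma_star + 1) / 2)
    upper = 2 * sum(stars.values()) + 2
    return stars, guaranteed, upper


# The sequence of Fig. 3, with a_0, a_1, a_2 written as 0, 1, 2.
HALF = [1, 0, 2, 0, 0, 2, 0, 0, 2, 0, 0, 2, 0, 1]
FIG3_SEQUENCE = HALF + HALF
```

For `FIG3_SEQUENCE` this gives $\sigma_1^* = 2$ and $\sigma_2^* = 4$, hence three guaranteed contacts (the figure quoted in the text for the $a_2$ symbols) and an upper bound of 14 on the optimal score, against the eleven contacts actually found by the algorithm.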



5.3 Infinite Alphabet 



For the proof of the performance ratio of the algorithm applied to the problem
over an infinite alphabet, we start from the fact that, although the size of the
alphabet is not bounded, the input sequence is finite. Therefore, the number of
different symbols occurring in the sequence that can generate contacts is also
finite. Let $d$ be this number. The terms $\sigma_i^*$ and $\sigma^*$ are defined as in the previous
section, but in this case we have $1 \le i < d$, with $d \le n/2$. Thus, the performance
ratio of the algorithm on a given input $s$ of length $n$ can be defined as follows.



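Theorem 4.

$$\mathcal{R}_n \ge \frac{\lceil (\sigma^* + 1)/2 \rceil}{2 \sum_{i=1}^{d-1} \sigma_i^* + 2} \ge \frac{\sigma^* + 1}{4[\sigma^*(d - 1) + 1]} \ge \frac{1}{4(\frac{n}{2} - 1)} \tag{6}$$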



5.4 The 3D Case 

Although structures generated by our algorithm are two-dimensional, they
guarantee a performance ratio also for the three-dimensional problem. In
the three-dimensional lattice, each non-neutral symbol can be in contact with at
most four equal symbols (five, if it is located at the endpoints of the sequence).
This fact yields the following lemma.

Lemma 4. For any sequence $s$ over an alphabet $\Sigma$ with $|\Sigma| = k$,

$$OPT(s) \le 4 \sum_{i=1}^{k-1} \sigma_i^* + 2 \tag{7}$$

This, together with Lemma 3, gives the absolute performance ratio for
the three-dimensional problem over an alphabet of size k.

Theorem 5. 



> I 2 I > ^ ^ ^ /O'! 

' -4Et>*+2-8[a*(fc-l) + l]-8(fc-l) 

It is worth mentioning that in the case of a binary alphabet the absolute
performance ratio of our algorithm (1/8) equals that of the best approximation
algorithm known for the three-dimensional problem.

6 Experimental Evaluation 

In the two-dimensional binary case, our algorithm equals the performance ratio
of the best algorithms proposed so far. Therefore, we have tested it on random
instances of the two-dimensional problem over a binary alphabet, and compared
the results to those of the other algorithms (see Table 1). Given $p_H = \Pr[s_i = H]$,
$\forall i \in [0, n]$, for different values of $p_H$ we have completed 1000 runs of our
algorithm and of the other two with guaranteed performance ratios of 1/4 (called
B and C as in the original paper), on instances of length 63 with two H's at the
endpoints (in order to reach the higher bound for the goal function).

The performance ratio of our algorithm seems to decrease as the average 
number of H’s in the sequence is increased. The same trend, even if with lower 
ratios, is shown by algorithm C, while algorithm B has a constant ratio. In the 
tests, the worst case ratio of 1/4 has been reached only by algorithm B, while 
on sequences like PP(HPP)^^“*'^, fc > 3 (whose optimal score is 4fc), algorithm 
C produces structures with score A: -I- 3. Thus, its performance ratio approaches 
1/4 as fc is increased p. It should be noted that our algorithm found the optimal 

Table 1. Average performance ratios of algorithms B and C, and our algorithm
(CFG) in the two-dimensional case of the problem over a binary alphabet, for different
values of $p_H = \Pr[s_i = H]$, $\forall i \in [0, n]$.

Algorithm       B       C       CFG
p_H = .15       0.52    0.60    0.79
p_H = .33       0.48    0.57    0.72
p_H = .5        0.48    0.55    0.68
p_H = .66       0.48    0.53    0.63
p_H = .85       0.48    0.50    0.55
Average         0.48    0.55    0.67
Worst Case      0.25    0.33    0.375

solution on this set of instances. The worst case found for our algorithm is 3/8:
this, as discussed in the previous sections, leaves open the issue of its performance
ratio.

7 Conclusions 

We presented polynomial-time approximation algorithms for the string folding 
problem that guarantee performance ratios both for the two- and three-dimen- 
sional case, and work over any alphabet, finite and infinite. To our knowledge, 
these are the first performance-guaranteed approximation algorithms that can 
be applied to the problem over alphabets larger than the binary one, while for 
the latter we equaled the performances of the best algorithms so far proposed, 
with better experimental results. In each case, the performance ratios that have 
been proved serve only as a lower bound for the actual ones. Our approach can 
also be easily extended to different goal functions, and also to more powerful 
grammars and non-discrete versions of string folding problems, as long as the 
correspondence between parse trees and feasible structures is preserved. 

Abstract. We present an index to search a two-dimensional pattern
of size m × m in a two-dimensional text of size n × n, even when the
pattern appears rotated in the text. The index is based on (path compressed)
tries. By using $O(n^2)$ (i.e. linear) space the index can search the
pattern in $O((\log_\sigma n)^{5/2})$ time on average, where $\sigma$ is the alphabet size.

We also consider various schemes for approximate matching, for which
we obtain either $O(\mathrm{polylog}\, n)$ or $O(n^{2\lambda})$ search time, where $\lambda < 1$ in
most useful cases. A larger index of size $O(n^2 (\log_\sigma n)^{3/2})$ yields an average
time of $O(\log_\sigma n)$ for the simplest matching model. The algorithms
have applications e.g. in content based information retrieval from image
databases.



1 Introduction 

Two dimensional pattern (image) matching has important applications in many 
areas, ranging from science to multimedia. String matching is one of the most 
successful special areas of algorithmics. Its theory and potential applications 
in the case of one-dimensional data, that is, linear strings and sequences, is 
well understood. However, the string matching approach still has considerable 
unexplored potential when the data is more complicated than just a linear string. 
Two dimensional digital images are an example of such data.

Examples of combinatorial pattern matching algorithms that work in two 
dimensions, but do not allow rotations are e.g. mnsismm- On the other 
hand, there are many non-combinatorial approaches to rotation invariant pat- 
tern matching, for a review, see e.g. HSl. The only combinatorial methods, that 
come close to us in some respects, are 1 1 211 4j . However, these do not address the 
pattern rotations. As stated in |2|, a major open problem in two-dimensional 
(combinatorial) pattern matching is to find the occurrences of a two-dimensional 

* Work supported by ComBi. 

** Work developed while the author was in a postdoctoral stay at the Dept, of Compu- 
ter Science, Univ. of Helsinki. Partially supported by the Academy of Finland and 
Fundacion Andes. 

* * * Work supported by the Academy of Finland. 



J. van Leeuwen et al. (Eds.): IFIP TCS2000, LNCS 1872, pp. 59-^^ 2000. 
© Springer- Verlag Berlin Heidelberg 2000 






K. Fredriksson, G. Navarro, and E. Ukkonen 



pattern of size m × m in a two-dimensional text of size n × n when the pattern
can appear in the text in rotated form. This was addressed from a combinatorial
point of view first in [3], in which an online algorithm allowing pattern rotations
was presented.

In this work we give the first algorithms for offline searching, that is, for 
building an index over the text that allows fast querying. The data structure we 
use is based on tries. Suffix trees for two-dimensional texts have been considered, 
e.g. in IHisiini. The idea of searching a rotated pattern using a “suffix” array of 
spiral like strings is mentioned in HOI, but only for rotations of multiples of 90 
degrees. The problem is much more complex if we want to allow any rotation. 

In [3] the consideration was restricted to the matches of a pattern inside the
text such that the geometric center of the pattern has been put exactly on top
of the center point of some text cell. This is called the “center-to-center
assumption”. Under this assumption, there are $O(m^3)$ different relevant rotation
angles to be examined for each text cell. In this paper we make this assumption,
too, and consider the following four matching models:

Exact: the value of each text cell whose center is covered by some pattern cell 
must match the value of the covering pattern cell. 

Hamming: an extension of the Exact model in which an error threshold $0 \le k < m^2$
is given and one is required to report all text positions and rotation
angles such that at most k text cells do not match the covering pattern cell.

Grays: an extension of the Exact model more suitable for gray level images: the
value of each text cell involved in a match must be between the minimum
and maximum value of the 9 neighboring cells surrounding the corresponding
pattern cell.

Accumulated: an extension of the Hamming model, more suitable for gray
levels. The sum of the absolute differences between the values of the text
cells involved in a match and the values of the corresponding pattern cells
must not exceed a given threshold k.
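
The four models can be sketched as predicates over already aligned cells. This is only an illustration: it assumes that the text cell values and the values of the covering pattern cells have been collected into parallel lists, and the function names are hypothetical, not taken from the paper.

```python
from typing import Sequence, Tuple


def exact_match(text: Sequence[int], pattern: Sequence[int]) -> bool:
    """Exact: every text cell equals its covering pattern cell."""
    return all(t == p for t, p in zip(text, pattern))


def hamming_match(text: Sequence[int], pattern: Sequence[int], k: int) -> bool:
    """Hamming: at most k text cells differ from their covering pattern cell."""
    return sum(1 for t, p in zip(text, pattern) if t != p) <= k


def grays_match(text: Sequence[int], ranges: Sequence[Tuple[int, int]]) -> bool:
    """Grays: ranges[i] is the (min, max) of the 9 pattern cells around the
    pattern cell covering text cell i (assumed to be precomputed)."""
    return all(lo <= t <= hi for t, (lo, hi) in zip(text, ranges))


def accumulated_match(text: Sequence[int], pattern: Sequence[int], k: int) -> bool:
    """Accumulated: the sum of absolute differences does not exceed k."""
    return sum(abs(t - p) for t, p in zip(text, pattern)) <= k
```

For example, with text values `[3, 2, 19, 2]` and pattern values `[3, 2, 4, 2]` there is one mismatch with absolute difference 15, so the Hamming model accepts for k = 1 and the Accumulated model accepts for k = 15.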

Our results are summarized in Table 1. For some algorithms we have two
versions, one with the pattern partitioning technique, and one without it. We denote
by $\sigma$ the alphabet size and assume in our average case results that the cell
values are uniformly and independently distributed over those $\sigma$ values. The
times reported are average-case bounds for the search. In the Hamming and
Accumulated models $\alpha = k/m^2$ (note that $\alpha < 1$ for Hamming and $\alpha < \sigma$
for Accumulated) and $k^*_H$ and $k^*_A$ denote the maximum k values up to where
some techniques work: $k^*_H = k/(1 - e/\sigma)$ and $k^*_A = k/(\sigma/(2e) - 1)$. Moreover,
$H^H_\sigma(\alpha) = -\alpha\log_\sigma(\alpha) - (1-\alpha)\log_\sigma(1-\alpha)$ and
$H^A_\sigma(\alpha) = -\alpha\log_\sigma(\alpha) + (1+\alpha)\log_\sigma(1+\alpha)$.
According to Table 1 the search times are sublinear on average
when the conditions are met, which implies in particular that $\alpha < 1 - e/\sigma$ for
Hamming and $\alpha < \sigma/(2e) - 1$ for Accumulated. In all the cases the index needs
$O(n^2)$ space and it can be constructed in average time $O(n^2 \log_\sigma n)$.

We have also considered the alternative model in which the pattern centers
are used instead of the text centers. For this case we obtain an index that for
the Exact model needs $O(n^2 (\log_\sigma n)^{3/2})$ space and gives $O(\log_\sigma n)$ time.
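
Model                                | Search time                                                              | Condition
Exact                                | $(\log_\sigma n)^{5/2}$                                                  | $\pi m^2/4 \ge \log_\sigma n^2$
Hamming                              | $(2\log_\sigma n)^{k+3/2}\,\sigma^k$                                     | $\pi m^2/4 \ge \log_\sigma n^2 \ge k^*_H$
Hamming (pattern partitioning)       | $n^{2(\alpha+H^H_\sigma(\alpha))}\, m^5/\log_\sigma n$                   | $\pi m^2/4 \ge \log_\sigma n^2 \ge k^*_H$
Grays                                | $n^{2(1-\log_\sigma(5/4))}\,(\log_\sigma n)^{3/2}$                       | $\pi m^2/4 \ge \log_\sigma n^2$
Accumulated                          | $n^{2\log_\sigma 2}\,(1+2(\log_\sigma n)/k)^k\,(\log_\sigma n)^{3/2}$    | $\pi m^2/4 \ge \log_\sigma n^2 \ge k^*_A$
Accumulated (pattern partitioning)   | $n^{2(\log_\sigma 2+H^A_\sigma(\alpha))}\, m^5/\log_\sigma n$            | $\pi m^2/4 \ge \log_\sigma n^2 \ge k^*_A$

Table 1. Time complexities achieved under different models.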



The algorithms are easily generalized for handling large databases of images. 
That is, we may store any number of images in the index, and search the query 
pattern simultaneously from all the images. The time complexities remain the
same, if we now consider that $n^2$ denotes the size of the whole image library.

2 The Data Structures 

Let T = T[1..n, 1..n] and P = P[1..m, 1..m] be two dimensional arrays of point
samples, such that m < n. Each sample has a color in a finite ordered alphabet
$\Sigma$. The size $|\Sigma|$ of $\Sigma$ is denoted by $\sigma$. The arrays P and T are point samples
of colors of some “natural” image. There are several possibilities to define a
mapping between T and P, that is, how to compare the colors of P to colors 
of T. Our approach to the problem is combinatorial. Assume that P has been 
put on top of T, in some arbitrary position. Then we will compare each color 
sample of T against the color of the closest sample of P. The distance between 
the samples is simply the Euclidean distance. This is also technically convenient. 
The Voronoi diagram for the samples is a regular array of unit squares. 

Hence we may define that the array T consists of unit squares called
cells, in the real plane (the (x, y)-plane). The corners of the cell for T[i,j]
are $(i-1, j-1)$, $(i, j-1)$, $(i-1, j)$ and $(i, j)$. Each cell has a center which
is the geometric center point of the cell, i.e., the center of the cell for T[i,j] is
$(i - \frac{1}{2}, j - \frac{1}{2})$. The array of cells for pattern P is defined similarly. The center of the
whole pattern P is the center of the cell in the middle of P. Precisely, assuming
for simplicity that m is odd, the center of P is the center of cell $P[\frac{m+1}{2}, \frac{m+1}{2}]$.
For images, the cells are usually called pixels.

Assume now that P has been moved on top of T using a rigid motion (translation
and rotation), such that the center of P coincides exactly with the center
of some cell of T. The location of P with respect to T can be uniquely given as
$((i - \frac{1}{2}, j - \frac{1}{2}), \theta)$ where $(i - \frac{1}{2}, j - \frac{1}{2})$ is the location of the center of P in T, and
$\theta$ is the angle between the x-axis of T and the x-axis of P. The occurrence (or
more generally, distance) between T and P at some location is determined by
comparing the colors of the cells of T and P that overlap. We will use the centers
of the cells of T for selecting the comparison points. That is, for the pattern at
location $((i - \frac{1}{2}, j - \frac{1}{2}), \theta)$, we look which cells of the pattern cover the centers
of the cells of the text, and compare the corresponding colors of those cells. As
the pattern rotates, the centers of the cells of the text move from one cell of P
to another. In [3] it is shown that this happens $O(m^3)$ times, so there are $O(m^3)$
relevant orientations of P to be checked. The actual comparison result of two
colors depends on the matching model.
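
The comparison rule under the center-to-center assumption can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes 1-based cell indices with the first index taken as the x coordinate, a particular rotation direction, and hypothetical function names; only the Exact model is checked.

```python
import math


def covering_cell(m, center, theta, text_cell):
    """Return the 1-based pattern cell (u, v) covering the center of
    text_cell, or None if that center lies outside the m x m pattern.

    center is the text cell (i, j) whose center coincides with the center
    of the pattern; theta is the rotation angle in radians.
    """
    ci, cj = center
    r, s = text_cell
    dx, dy = r - ci, s - cj  # the 1/2 offsets of the cell centers cancel
    c, sn = math.cos(theta), math.sin(theta)
    # Rotate by -theta into the pattern frame; the pattern center is (m/2, m/2).
    px = c * dx + sn * dy + m / 2
    py = -sn * dx + c * dy + m / 2
    u, v = math.floor(px) + 1, math.floor(py) + 1
    if 1 <= u <= m and 1 <= v <= m:
        return (u, v)
    return None


def exact_occurrence(T, P, center, theta):
    """Exact model: every text cell whose center is covered must match."""
    n, m = len(T), len(P)
    ci, cj = center
    for r in range(max(1, ci - m), min(n, ci + m) + 1):
        for s in range(max(1, cj - m), min(n, cj + m) + 1):
            cell = covering_cell(m, center, theta, (r, s))
            if cell is not None and T[r - 1][s - 1] != P[cell[0] - 1][cell[1] - 1]:
                return False
    return True
```

Text cell centers that fall exactly on a pattern cell border are assigned to one side by `math.floor`; a production version would need to treat these critical angles explicitly.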

We propose to use a trie based index of the text, defined as follows. A trie is a
well-known tree structure for storing strings in which each edge is labeled by a
character. Each cell of the text defines a string which is obtained by reading text
positions at increasing distances from the center of the cell. The first character
is that of the cell, then come the 4 closest centers (from the cells above, below,
left and right of the central cell), then the other 4 neighbors, and so on. The
cells at the same distance are read in some predefined order; the only important
thing is to read the cells in the order of increasing distance from the center cell.
This effectively utilizes the $O(m^3)$ result to restrict the number of rotations our
algorithms must consider on average, see Sec. 3. If such a string hits the border
of the text it is considered finished there. We will call sistrings (for “semi-infinite
strings”) the strings obtained in this way. Figure 1 shows a possible reading
order.
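
A sistring can be read off with a few lines of code. The sketch below is an assumption-laden illustration: it uses 0-based indices, breaks ties between cells at the same distance by their polar angle (the text only requires some fixed order), and stops at the first cell that falls outside the text.

```python
import math


def spiral_offsets(radius):
    """Offsets (dx, dy) sorted by increasing Euclidean distance from the
    origin; ties are broken by polar angle (an arbitrary but fixed choice)."""
    offsets = [(dx, dy)
               for dx in range(-radius, radius + 1)
               for dy in range(-radius, radius + 1)]
    offsets.sort(key=lambda o: (o[0] ** 2 + o[1] ** 2, math.atan2(o[1], o[0])))
    return offsets


def sistring(T, i, j):
    """Read the sistring of the square text T starting at cell (i, j).
    The string is considered finished when it hits the border."""
    n = len(T)
    out = []
    for dx, dy in spiral_offsets(n):
        r, c = i + dx, j + dy
        if not (0 <= r < n and 0 <= c < n):
            break
        out.append(T[r][c])
    return out
```

For the middle cell of a 5 × 5 text this reads all 25 cells: the center, its 4 closest neighbors, the 4 diagonal neighbors, and so on. The exact order within each distance class differs from the one in the figure, which is permitted by the definition.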

Fig. 1. A possible reading order (“spiral”) for the sistring that starts in the middle of
a text of size 5 × 5. Figure (a) shows the reading order by enumerating the cells, and
Figure (b) shows the enumeration graphically. Figure (c) shows the color values of the
cells of the image, and for that image the sistring corresponding to the reading order
is < 3, 2, 19, 2, 6, 7, 5, 5, 28, 3, 12, 1, 12, 13, 31, 1, 56, 1, 9, 23, 22, 2, 2, 3, 4 >.



Therefore each text cell defines a sistring of length $O(n^2)$. A trie on those
strings (called the sistring trie) can be built, which has average size $O(n^2)$ and
average depth $O(\log_\sigma n^2)$. Alternatively, the unary paths of such a trie can be
compressed, in a manner similar to that used to compress suffix trees. In such a tree
each new string adds at most one leaf and one internal node, so the worst case
size is $O(n^2)$.

Still another possibility is to construct an array of pointers to T, sorted in
the lexicographic order of the sistrings in T. Such an array, called the sistring
array, can be formed by reading the leaves of a sistring trie in the lexicographic
order, or directly, by sorting the sistrings. The array needs $O(n^2)$ space, but is
much smaller than the sistring trie or tree in practice. Hence the sistring array is
the most attractive alternative from the practical point of view and will therefore
be used in the experiments.

The sistring trie can be built in $O(n^2 \log_\sigma n)$ average time, by level-wise
construction. The sistring array can be built in $O(n^2 \log n)$ string comparisons,
which has to be multiplied by $O(\log_\sigma n^2)$ to obtain the average number of
character comparisons. The sistring array is very similar to the suffix array, which
in turn is a compact representation of a suffix tree.

For simplicity, we describe the algorithms for the sistring trie, although they 
run with the same complexity over sistring trees. For sistring arrays one needs 
to multiply the search time results by O(logn) as well, because searching the 
array uses binary search. 
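
The extra $O(\log n)$ factor comes from locating a prefix by binary search. A minimal sketch of a sistring array, assuming the sistrings are already available as tuples keyed by text position (the names below are hypothetical):

```python
def build_sistring_array(sistrings):
    """sistrings maps a text position to its sistring (a tuple of values).
    The array holds the positions sorted by their sistrings."""
    return sorted(sistrings, key=lambda pos: sistrings[pos])


def _bound(array, sistrings, prefix, upper):
    """Binary search for the first entry whose truncated sistring is
    >= prefix (upper=False) or > prefix (upper=True)."""
    lo, hi = 0, len(array)
    while lo < hi:
        mid = (lo + hi) // 2
        head = sistrings[array[mid]][:len(prefix)]
        if head < prefix or (upper and head == prefix):
            lo = mid + 1
        else:
            hi = mid
    return lo


def find_prefix(array, sistrings, prefix):
    """Return all text positions whose sistring starts with prefix."""
    prefix = tuple(prefix)
    lo = _bound(array, sistrings, prefix, upper=False)
    hi = _bound(array, sistrings, prefix, upper=True)
    return array[lo:hi]
```

Each probe compares at most `len(prefix)` characters, so a lookup costs $O(|prefix| \log N)$ character comparisons for an array of N entries.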

We consider now a property of the sistring trie that is important for all the
results that follow. We show that under a uniform model, the number of sistring
trie nodes at depth $\ell$ is $\Theta(\min(\sigma^\ell, n^2))$. This roughly is to say that in levels
$\ell \le h$, for $h = \log_\sigma(n^2) = 2\log_\sigma n$, all the different strings of length $\ell$ exist, while
from that level on the $\Theta(n^2)$ sistrings are already different. In particular this
means that nodes deeper than h have O(1) children because there exists only
one sistring in the text with that prefix of length h (note that a sistring prefix is
graphically seen as a spiral inside the text, around the corresponding text cell).

To prove this property we consider that there are $n^2$ sistrings uniformly
distributed across $\sigma^\ell$ different prefixes of length $\ell$, for any $\ell$. The probability of
a prefix not being “hit” after $n^2$ attempts is $(1 - 1/\sigma^\ell)^{n^2}$, so the average number
of different prefixes hit (i.e. existing sistring trie nodes) is

$$\sigma^\ell\left(1 - (1 - 1/\sigma^\ell)^{n^2}\right) = \sigma^\ell\left(1 - e^{-\Theta(n^2/\sigma^\ell)}\right) = \sigma^\ell\left(1 - e^{-x}\right)$$

for $x = \Theta(n^2/\sigma^\ell)$. Now, if $n^2 = o(\sigma^\ell)$ then $x = o(1)$ and
$1 - e^{-x} = 1 - (1 - x + O(x^2)) = \Theta(x) = \Theta(n^2/\sigma^\ell)$, which gives the result $\Theta(n^2)$. On the other
hand, if $n^2 = \Omega(\sigma^\ell)$ then $x = \Omega(1)$ and the result is $\Theta(\sigma^\ell)$. Hence the number
of sistring trie nodes at depth $\ell$ is on average $\Theta(\min(\sigma^\ell, n^2))$, which is the same
as the worst case. Indeed, in the worst case the constant is 1, i.e. the number of
different strings is at most $\min(\sigma^\ell, n^2)$, while on average the constant is smaller.
We need this result for giving bounds for the maximum number of sistring trie
nodes inspected by our algorithms.
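
The expected node count is easy to evaluate numerically. The sketch below assumes the uniform model just described, with $n^2$ sistrings thrown into $\sigma^\ell$ equally likely prefixes; the function name is illustrative only.

```python
import math


def expected_nodes(sigma, ell, n):
    """Expected number of distinct length-ell prefixes among n*n uniformly
    random sistrings over an alphabet of size sigma (ell >= 1, sigma >= 2)."""
    buckets = float(sigma ** ell)
    # 1 - (1 - 1/buckets)^(n^2), computed stably through log1p / expm1.
    hit_probability = -math.expm1(n * n * math.log1p(-1.0 / buckets))
    return buckets * hit_probability
```

With $\sigma = 4$ and $n = 32$ we have $h = \log_4 1024 = 5$: at depth 2 essentially all 16 prefixes exist, while at depth 10 the count is just below $n^2 = 1024$, in line with the $\min(\sigma^\ell, n^2)$ behaviour.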



3 The Exact Model 

We first describe the algorithm for the exact matching model. The other algorithms
are relatively straightforward extensions of it. As shown in [3], there
are $O(m^3)$ relevant orientations in which the pattern can occur in a given text
position. A brute force approach is to consider the $O(m^3)$ pattern orientations
in turn and search each one in the sistring trie. To check the pattern in a given
orientation we have to see in which order the pattern cells have to be read so
that they match the reading order of the sistring trie construction.

Figure 2 shows the reading order induced in the pattern by a rotated occurrence,
using the spiral like reading order given in Figure 1. For each possible
rotation we compute the induced reading order, build the string obtained by
reading the pattern in that order from its center, and search that string in the
sistring trie. Note in particular that some pattern cells may be read twice and
others may not be considered at all. Observe that in our example the cells
numbered 30, 32, 34, and 36 are outside the maximum circle contained in the pattern,
and are therefore ignored in the sistring trie search. This is because those values
cannot be used unless some levels are skipped in the search, which would mean
entering into all the branches after reading cell number 20. Text cells 21-29, 31,
33, 35, and 37 all fall outside the pattern.

The algorithm first considers the sistring of P for angle $\theta = 0$. The sistring
is searched from the trie until some cell of P mismatches, at depth $\ell$. Now
the pattern must be rotated a little in order to read the next sistring. The next
rotation to try is such that any of the first $\ell$ cells of the previous sistring changes,
that is, any of the centers of the cells of T hits some border of the first $\ell$ sistring
cells of P.

Fig. 2. Reading order induced in the pattern by a rotated occurrence. Figure (a) shows
the pattern superimposed on the image. Figure (b) shows the enumeration of the
induced reading order, and Figure (c) shows the color values for the pattern cells. The
corresponding sistring is < 3, 2, 4, 2, 6, 7, 5, 5, 12, 9, 19, 9, 5, 6, 7, 3, 1, 7, 1, 1, 3 >. Cells
numbered 30, 32, 34, and 36 are ignored in the trie search.



The number of rotations to try depends on how far we are from the center.
That is, the number of the text centers that any cell of P may cover depends on
how far the cell is from the rotation center. If the distance of some cell of P from
the rotation center is d, then it may cover O(d) centers of T. In general, there
are $O(m^3)$ rotations for a pattern of size m × m cells. The number of rotations
grows as we get farther from the center, and they are tried only on the existing
branches of the sistring trie. As the pattern is read in a spiral form, when we
are at depth $\ell$ in the sistring trie we are considering a pattern cell which is at
distance $O(\sqrt{\ell})$ from the center. This means that we need to consider $O(\ell^{3/2})$
different rotations if the search has reached depth $\ell$. For each rotation we assume
in the analysis that the sistring is read and compared from the beginning.

The fact that on average every different string up to length h exists in the
sistring trie means that we always enter until depth h. The number of sistring
trie nodes considered up to depth h is thus

$$\sum_{\ell=1}^{h} \ell^{3/2} = O(h^{5/2})$$

At this point we have $O(h^{3/2})$ candidates that are searched deeper in the
sistring trie. Now, each such candidate corresponds to a node of the sistring
trie at depth h, which has O(1) children because there exist O(1) text sistrings
which share this prefix with the pattern (a “prefix” here means a circle around
the pattern center).

Two alternative views for the same process are possible. First, consider all
the $O(h^{3/2})$ candidates together as we move to deeper levels $\ell > h$ in the sistring
trie. There are on average $n^2/\sigma^\ell$ sistrings of length $\ell$ matching a given string, so
the total work done when traversing the deeper levels of the sistring trie until
the candidates get eliminated is

$$\sum_{\ell \ge h+1} h^{3/2}\,\frac{n^2}{\sigma^\ell} = h^{3/2}\sum_{\ell \ge 1} \frac{1}{\sigma^\ell} = O(h^{3/2})$$


An alternative view is that we directly check in the text each candidate that
arrived to depth h, instead of using the sistring trie. There are $O(h^{3/2})$ candidates
and each one can be checked in O(1) time: if we perform the comparison in a
spiral way, we can add the finer rotations as they become relevant. The $\ell$-th
pattern cell in spiral order (at distance $\sqrt{\ell}$ from the center) is compared (i.e. the
comparison is not abandoned before) with probability $\ell^{3/2}/\sigma^\ell$. Summing up the
probabilities of being compared over all the characters yields

$$\sum_{\ell \ge 1} \frac{\ell^{3/2}}{\sigma^\ell} = O(1)$$

where for simplicity we have not considered that we have compared already h
characters and have an approximate idea of the orientations to try.

Therefore we have a total average search cost of $O((\log_\sigma n)^{5/2})$. This assumes
that the pattern is large enough, i.e. that we can read h characters from the center
in spiral form without hitting the border of the pattern. This is equivalent to
the condition $\pi m^2/4 \ge \log_\sigma n^2$, which is a precondition for our analysis. Smaller
patterns leave us without the help of the sistring trie long before we have
eliminated enough candidates to guarantee low enough average time to check their
occurrences in the text.
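
The quantities in this analysis are simple to compute for concrete parameters. The following sketch assumes the precondition is read as $\pi m^2/4 \ge 2\log_\sigma n$ (the area of the largest circle inside the pattern must hold h cells); the helper names are illustrative.

```python
import math


def trie_depth_h(n, sigma):
    """h = log_sigma(n^2) = 2 log_sigma(n): depth up to which, on average,
    every string exists in the sistring trie."""
    return 2 * math.log(n, sigma)


def pattern_large_enough(m, n, sigma):
    """Precondition of the analysis: h cells fit in the largest circle
    contained in the m x m pattern."""
    return math.pi * m * m / 4 >= trie_depth_h(n, sigma)


def nodes_up_to_depth(h):
    """Sum of ell^(3/2) for ell = 1..h: rotations tried at each level."""
    return sum(ell ** 1.5 for ell in range(1, int(h) + 1))
```

For a binary text with n = 1024 we get h = 20, so a 5 × 5 pattern (circle area about 19.6) is slightly too small while a 7 × 7 pattern satisfies the condition.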

4 The Hamming Model

This model is an extension of the exact matching model. An additional parameter
k is provided to the search algorithm, such that a mismatch occurs only when
more than k characters have not matched. In this section we use $\alpha = k/m^2$. We
require $0 \le \alpha < 1$, as otherwise the pattern would match everywhere.

The problem is much more complicated now. Even for a fixed rotation, the
number of sistring trie nodes to consider grows exponentially with k. To see this,
note that at least k characters have to be read in all the sistrings, which gives
a minimum of $O(\sigma^k)$ nodes. This means in particular that if $k \ge h$ then we will
consider all the $n^2$ sistrings and the index will be of no use, so we assume k < h;
still stricter conditions will appear later. We first present a standard technique
and then a pattern partitioning technique.



4.1 Standard Searching 

Imagine that for each possible rotation we backtrack on the sistring trie, entering
into all the possible branches and abandoning a path when more than k
mismatches have occurred. As explained, up to depth k we enter into all the
branches. Since h > k, we have to analyze which branches we enter at depths
$k < \ell \le h$. Since all those strings exist in the sistring trie, this is the same as
to ask how many different strings of length $\ell$ match a pattern prefix of length $\ell$
with at most k mismatches.

A pessimistic model assumes that there are $\binom{\ell}{k}$ ways to choose the cells that
will not match, and $\sigma^k$ selections for them. As we can replace a cell by itself we
are already counting the cases with fewer than k errors. The model is pessimistic
because not all these choices lead to different strings. To all this we have to add
the fact that we are searching $O(\ell^{3/2})$ different strings at depth $\ell$. Hence, the
total number of sistring trie nodes touched up to level h is

$$\sum_{\ell=1}^{h} \ell^{3/2}\binom{\ell}{k}\sigma^k = O\!\left(h^{3/2}\binom{h}{k}\sigma^k\right)$$
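
To see how pessimistic the counting model is, the exact number of strings within Hamming distance k of a fixed string can be compared with the bound $\binom{\ell}{k}\sigma^k$. This is an illustrative check only; the function names are not from the paper.

```python
from math import comb


def strings_within_hamming(ell, k, sigma):
    """Exact number of length-ell strings over sigma symbols that differ
    from a fixed string in at most k positions."""
    return sum(comb(ell, i) * (sigma - 1) ** i for i in range(min(k, ell) + 1))


def pessimistic_bound(ell, k, sigma):
    """Bound used in the analysis: choose k cells, then any symbol for each."""
    return comb(ell, k) * sigma ** k
```

For $\ell = 5$, $k = 2$ and $\sigma = 4$ the exact count is 106, against a bound of 160, because the bound counts the same string once for every choice of k cells that contains its mismatches.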



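For the second part of the search, we consider that there are on average $n^2/\sigma^\ell$
sistrings of length $\ell > h$ equal to a given string. So the total amount of nodes
touched after level h is

$$\sum_{\ell \ge h+1} \ell^{3/2}\binom{\ell}{k}\sigma^k\,\frac{n^2}{\sigma^\ell} = n^2\sum_{\ell \ge h+1} \ell^{3/2}\binom{\ell}{k}\frac{1}{\sigma^{\ell-k}}$$

In the Appendix we show that $\binom{\ell}{k}/\sigma^{\ell-k}$ is exponentially decreasing with $\ell$ for

$$\ell \ge k^*_H = \frac{k}{1 - e/\sigma}$$

while otherwise it is $\Omega(1/\sqrt{\ell})$.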
Therefore, the result depends on whether or not $h \ge k^*_H$. If $h < k^*_H$ then
the first term of the summation alone is $\Omega(hn^2)$ and there is no sublinearity. If,
on the other hand, $h \ge k^*_H$, we have an exponentially decreasing series where
the first term dominates the whole summation. That is, the cost of the search
in levels deeper than h is

$$O\!\left(h^{3/2}\binom{h}{k}\sigma^k\right) = O\!\left((2\log_\sigma n)^{k+3/2}\,\sigma^k\right)$$

which matches the cost of the first part of the search as well. Therefore, the
condition for a sublinear search time is $k^*_H \le h \le m^2$. This in particular implies
that $\alpha < 1 - e/\sigma$.



4.2 Pattern Partitioning 



The above search time is still polylogarithmic in n, but exponential in k. We
present now a pattern partitioning technique that obtains a cost of the form
$O(n^{2\lambda})$ for $\lambda < 1$. The idea is to split the pattern in $j^2$ pieces (j divisions across
each coordinate). If there are at most k mismatches in a match, then at least
one of the pieces must have at most $\lfloor k/j^2 \rfloor$ errors. So the technique is to search
for each of the pieces (of size (m/j) × (m/j)) separately allowing $k/j^2$ errors,
and for each (rotated) match of a piece in the text, go to the text directly and
check if the match can be extended to a complete occurrence with k errors. Note
that the $\alpha$ for the pieces is the same as for the whole pattern.
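
The pigeonhole argument behind the partitioning can be checked on a small mismatch mask. The sketch below is an illustration under simplifying assumptions: j divides m, the mask is a 0/1 matrix marking mismatching cells, and the function names are hypothetical.

```python
def piece_budget(k, j):
    """Errors allowed per piece when the pattern is split into j*j pieces."""
    return k // (j * j)


def pieces_within_budget(mismatch, j, k):
    """Return the pieces (a, b) of the m x m mismatch mask that contain at
    most floor(k / j^2) mismatches. Assumes j divides m."""
    m = len(mismatch)
    size = m // j
    budget = piece_budget(k, j)
    hits = []
    for a in range(j):
        for b in range(j):
            errors = sum(mismatch[r][c]
                         for r in range(a * size, (a + 1) * size)
                         for c in range(b * size, (b + 1) * size))
            if errors <= budget:
                hits.append((a, b))
    return hits
```

Whenever the mask contains at most k mismatches the returned list is non-empty, so searching every piece with the reduced budget cannot miss an occurrence; each hit is only a candidate and still has to be verified against the whole pattern.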

The center-to-center assumption does not hold when searching for the pieces. 
However, for each possible rotation of the whole pattern that matches with the 
center-to-center assumption, it is possible to fix some position of the center of 
each piece inside its text cell. (The center of the piece is ambiguous, as there
are infinitely many angles for the matching pattern: there are $O(m^3)$ different
relevant rotations of the pattern, and between the corresponding angles, there are
infinitely many angles where the occurrence status does not change. However,
any of the possible positions for the center of the pieces can be chosen.) The
techniques developed to read the text in rotated form can be easily adapted to
introduce a fixed offset at the center of the matching subpattern. Therefore we
search each of the pieces in each of the $O(m^3)$ different rotations.

The search cost for this technique becomes $O(j^2 m^3)$ times the cost to search a
piece (with a fixed rotation and center offset) in the sistring trie and the cost to
check for a complete occurrence if the piece is found.

If we consider that $(m/j)^2 \le h$, then all the strings exist when a piece is
searched. Therefore the cost to traverse the sistring trie for a piece at a fixed
rotation is equivalent to the number of strings that can be obtained with $k/j^2$
mismatches from it, i.e.

$$U = \binom{(m/j)^2}{k/j^2}\,\sigma^{k/j^2}$$

while the cost to check all the U candidates is $U k\, n^2/\sigma^{(m/j)^2}$, i.e. k times per
generated string times the average number of times such a string appears in the
text. Therefore the overall cost is

$$j^2 m^3\left(U + U k\,\frac{n^2}{\sigma^{(m/j)^2}}\right)$$

where (after distributing the initial factor) the first term decreases and the second
term increases as a function of j. The optimum is found when both terms
meet, i.e. $j = m/\sqrt{2\log_\sigma n}$, which is in the limit of our condition $(m/j)^2 \le h$. In
fact, the second term is decreasing only for $\alpha < 1 - e/\sigma$, otherwise the optimum
is j = 1, i.e. no pattern partitioning.

For this optimal j, the overall time bound becomes

$$O\!\left(n^{2(\alpha + H^H_\sigma(\alpha))}\,\frac{m^5}{\log_\sigma n}\right)$$

where we have written $H^H_\sigma(\alpha) = -\alpha\log_\sigma(\alpha) - (1-\alpha)\log_\sigma(1-\alpha)$.

This bound is sublinear as long as $\alpha < 1 - e/\sigma$. On the other hand, we can
consider using a larger j, violating the assumed condition $(m/j)^2 \le h$ in order
to reduce the verification time. However, the search time will not be reduced
and therefore the time bound cannot decrease.
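
The quantities involved can be evaluated for concrete parameters. This is a sketch under the assumption that the optimal number of divisions is $j = m/\sqrt{2\log_\sigma n}$ and that the exponent of n in the bound is $2(\alpha + H^H_\sigma(\alpha))$; the function names are illustrative.

```python
import math


def entropy_hamming(alpha, sigma):
    """H(alpha) = -alpha log_sigma(alpha) - (1 - alpha) log_sigma(1 - alpha),
    for 0 < alpha < 1."""
    return (-alpha * math.log(alpha, sigma)
            - (1 - alpha) * math.log(1 - alpha, sigma))


def optimal_divisions(m, n, sigma):
    """j = m / sqrt(2 log_sigma n), i.e. pieces of about h cells each."""
    return m / math.sqrt(2 * math.log(n, sigma))


def exponent_of_n(alpha, sigma):
    """Exponent of n in the partitioned Hamming bound; values below 2 are
    sublinear in the text size n^2."""
    return 2 * (alpha + entropy_hamming(alpha, sigma))
```

For instance, with $\sigma = 16$ and $\alpha = 0.05$ the exponent is roughly 0.24, far below 2, while in practice j would be rounded to an integer that divides the pattern reasonably evenly.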



5 The Grays Model 

In the Grays model a match requires that the color of a text cell must be between
the minimum and maximum pattern colors in a neighborhood of the pattern cell
that corresponds to the text cell. In this case, we do not enter into a single branch
of the sistring trie, but for each pattern cell we follow all branches where the color is
between the minimum and maximum neighbor of that pattern cell. The number
of characters qualifying for the next pattern character is a random variable that
we call A, where $1 \le A \le \sigma$.

Since there are now $O(A^\ell)$ possible strings that match the pattern prefix of
length $\ell$, we touch

$$\sum_{\ell=1}^{h} \ell^{3/2} A^\ell = O(h^{3/2} A^h)$$

sistring trie nodes up to depth h, because all those qualifying strings exist up to
that depth. From that depth on, there are on average $O(n^2/\sigma^\ell)$ sistrings in the
text matching a given string of length $\ell$. Therefore, the work in the deeper part
of the sistring trie is

$$n^2\sum_{\ell > h} \ell^{3/2}\left(\frac{A}{\sigma}\right)^{\ell} = O(h^{3/2} A^h)$$

since the first term of the summation dominates the rest. Therefore, the total
complexity is

$$O(h^{3/2} A^h) = O\!\left((\log_\sigma n)^{3/2}\, n^{2\log_\sigma A}\right) = O\!\left((\log_\sigma n)^{3/2}\, n^{2(1-\log_\sigma(5/4))}\right)$$

Here the last step is based on the fact that $\bar{A}$, the average value of A, equals $(4/5)\sigma$,
as the difference between the maximum and minimum of 9 values uniformly
distributed over $\sigma$. On the other hand, the cost function is concave in terms of
A, and hence $\overline{f(A)} \le f(\bar{A})$. In practice A is much less than $(4/5)\sigma$.
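
The factor 4/5 is the expected range of 9 independent uniform values: for n samples uniform on [0, 1] the expected difference between maximum and minimum is $(n-1)/(n+1)$, which gives 8/10 for n = 9. A small Monte Carlo sketch (continuous values and a fixed seed are assumptions made here for reproducibility) confirms the figure:

```python
import random


def mean_range(samples, trials, seed=0):
    """Average of (max - min) over `trials` draws of `samples` independent
    values uniform on [0, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        values = [rng.random() for _ in range(samples)]
        total += max(values) - min(values)
    return total / trials
```

Scaling by the alphabet size $\sigma$ gives the average number of qualifying characters used in the bound, up to discretization effects for small alphabets.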

6 The Accumulated Model 

An even more powerful model is the Accumulated model, which provides a Hamming-like
matching capability for gray-level images. Here, the sum of the absolute
differences between text colors and the color of the corresponding pattern cell
must not exceed k.

As for the Hamming model, we have to enter, for each relevant rotation, into
all the branches of the sistring trie until we obtain an accumulated difference
larger than k. We present first a standard approach and then a pattern partitioning
technique.



6.1 Standard Searching 

We enter into all the branches of the sistring trie until we can report a match or
the sum of the differences exceeds k. As we show in the Appendix, the number of
strings matching a given string of length $\ell$ under this model is at most $2^\ell\binom{k+\ell}{k}$.
Since up to length h all of them exist, we traverse

$$\sum_{\ell=1}^{h} \ell^{3/2}\, 2^\ell\binom{k+\ell}{k}$$

nodes in the trie. For the deeper parts of the trie there are $O(n^2/\sigma^\ell)$ strings
matching a given one on average, so the rest of the search takes

$$\sum_{\ell > h} \ell^{3/2}\, 2^\ell\binom{k+\ell}{k}\frac{n^2}{\sigma^\ell} = n^2\sum_{\ell > h} \ell^{3/2}\left(\frac{2}{\sigma}\right)^{\ell}\binom{k+\ell}{k}$$

In the Appendix we show that $(2/\sigma)^\ell\binom{k+\ell}{k}$ is exponentially decreasing with $\ell$
for $k/\ell < \sigma/(2e) - 1$, otherwise it is $\Omega(1/\sqrt{\ell})$. Therefore, we define

$$k^*_A = \frac{k}{\sigma/(2e) - 1}$$

and if $h < k^*_A$ the summation is at least $\Omega(hn^2)$ and therefore not sublinear. If,
on the other hand, $h \ge k^*_A$, then the first term of the summation dominates the
rest, for a total search cost of

$$O\!\left(h^{3/2}\, 2^h\binom{k+h}{k}\right)$$

which is sublinear in n for $\sigma > 2$. On the other hand, $\sigma = 2$ means a bilevel image,
where the Hamming model is the adequate choice. Hence we obtain sublinear
complexity (albeit exponential on k) for $k^*_A \le 2\log_\sigma n$.
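
The counting bound can be sanity-checked by brute force on tiny inputs. The sketch below assumes the bound is read as $2^\ell\binom{k+\ell}{k}$ (a sign for each cell times the ways to distribute a total difference of at most k over $\ell$ cells); this reading and the function names are assumptions, not taken verbatim from the paper.

```python
from itertools import product
from math import comb


def strings_within_accumulated(pattern, k, sigma):
    """Brute-force count of strings over {0, ..., sigma-1} whose sum of
    absolute differences to `pattern` is at most k (tiny inputs only)."""
    return sum(
        1
        for cand in product(range(sigma), repeat=len(pattern))
        if sum(abs(a - b) for a, b in zip(cand, pattern)) <= k
    )


def accumulated_bound(ell, k):
    """Assumed upper bound 2^ell * C(k + ell, k)."""
    return 2 ** ell * comb(k + ell, k)
```

For the pattern (2, 2, 2) over 5 gray levels, 7 strings lie within accumulated distance 1 and 25 within distance 2, against bounds of 32 and 80, so the bound is loose but safe on these cases.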
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6.2 Pattern Partitioning 

As for the Hamming model, we can partition the pattern to subpatterns that 
are searched exhaustively in the sistring trie. Again considering (jn/j)'^ < h we 
have a total search cost of 



( U 



Uk- 



■(vn/jY 



where this time 

U = 2(-/^)" 

After distributing the initial factor of the cost formula, we see that the first
term decreases and the second term increases as a function of j. The optimum is
found when both terms meet, which is again $j = m/\sqrt{2 \log_\sigma n}$, which is
consistent with our condition $(m/j)^2 \le h$. In fact, the second term is
decreasing only for $\alpha < \sigma/(2e) - 1$; otherwise the optimum is j = 1, i.e.
no pattern partitioning. For this optimal j, the overall complexity is

/ 2ilog„2+H^(a)) 

Vlog.n 

where we have defined
$H^A_\sigma(\alpha) = -\alpha \log_\sigma(\alpha) + (1 + \alpha) \log_\sigma(1 + \alpha)$.

This complexity is sublinear as long as $\alpha < \sigma/(2e) - 1$. Again, we
could consider using a larger j value, but the complexity does not improve.



7 An Alternative Matching Model 

We have considered up to now that text centers match the value of the pattern 
cells they lie in. This has been done for technical convenience, although an 
equally reasonable alternative model is that the pattern cells must match the 
text color where their centers lie in the text. 

Except for the Grays model, all the algorithms considered can be adapted to 
this case. The algorithms are more complex in practice now, because there may 
be more than one pattern center lying at the same text cell, and even no pattern 
center at all. This means that in some branches of the sistring trie we may have 
more than one condition to fit (which may be incompatible and then the branch 
can be abandoned under some models) and there may be no condition at all, in 
which case we have to follow all the branches at that level of the trie. 

On average, however, we still have $O(\ell)$ conditions when entering the
sistring trie with a pattern string of length $\ell$, and therefore all the time
bounds remain the same. However, in the Exact matching model, we can do better
using the pattern centers.

We now consider indexing the rotated versions of the text sistrings, instead of
considering the rotated versions of the pattern at search time. Hence, the pattern
is simply searched with no rotations. Imagine that we index all the rotations of
the text up to depth H. This means that there will be $O(n^2 H^{3/2})$ sistrings,
and the sizes of the sistring trie and array will grow accordingly.

The benefit comes at search time: in the first part of the search we do not
need to consider rotations of the pattern, since all the rotated ways to read the
text are already indexed. Since we index $O(n^2 H^{3/2})$ strings now, all the
different sistrings will exist until depth
$h' = \log_\sigma(n^2 H^{3/2}) = 2 \log_\sigma n + \frac{3}{2} \log_\sigma H$. We first
assume that $H \ge h'$. This means that until depth H we pay $O(H)$. After that
depth all the surviving rotations are considered. Since $H \ge h'$, they all yield
different strings, and the summation, as in the earlier analysis, yields
$O(n^2 H^{3/2} / \sigma^H)$.

Therefore the total search time is

$$O\left(H + \frac{n^2 H^{3/2}}{\sigma^H}\right)$$

which is optimized for $H = 2 \log_\sigma n + \frac{1}{2} \log_\sigma H$. Since this is
smaller than $h'$, we take the minimal $H = h'$. For instance, $H = x \log_\sigma n$
works for any x > 2.

This makes the total search time $O(\log_\sigma n)$ on average. The space complexity
becomes now $O(n^2 (\log_\sigma n)^{3/2})$. Trying to use $H < h'$ worsens the
complexity.
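
The choice of H can be checked with a short calculation. The sketch below assumes that the total cost has the form $O(H + n^2 H^{3/2}/\sigma^H)$, i.e. $O(H)$ for the first H levels plus the remaining summation; it balances the two terms and then evaluates the second term at $H = h'$.

```latex
% Balancing the two terms of O(H + n^2 H^{3/2} / \sigma^H):
\[
  H = \frac{n^2 H^{3/2}}{\sigma^H}
  \;\Longleftrightarrow\;
  \sigma^H = n^2 H^{1/2}
  \;\Longleftrightarrow\;
  H = 2\log_\sigma n + \tfrac{1}{2}\log_\sigma H .
\]
% At the minimal admissible depth H = h' we have \sigma^{h'} = n^2 H^{3/2}, hence
\[
  \frac{n^2 H^{3/2}}{\sigma^{H}} = 1
  \qquad\text{and}\qquad
  O(H + 1) = O(\log_\sigma n).
\]
```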

The matching model has changed, however. In the normal index the text 
sistrings are indexed once at a fixed rotation (zero). When a given pattern rota- 
tion is tried, the pattern is read in rotated form, in an order driven by the text 
centers. Now the text sistrings are read in all the rotated forms, and the pattern 
will be read once. The way to index a text sistring in rotated form is to assume 
that a rotated pattern is superimposed onto it and read the text cells where the 
pattern cells, read in order, fall. This effectively corresponds to the model we 
are considering in this section. 



8 Experimental Results (Preliminary) 

We have implemented the algorithms for the Exact and Accumulated models,
without the pattern partitioning technique. For the index structure, we used a
sistring array. The array-based implementation is much more space efficient, but
the search cost is also higher (both asymptotically and in the constant factors).

The implementation is in C, compiled using gcc 2.95.2 on Linux 2.0.38,
running on a 700 MHz Pentium III machine. The implementation is not very
optimized, and much of the time is spent in trigonometry: computing the next
angle to try, and computing the coordinates of the cells for the given angle.
Our test text was an image of size 768 x 768 cells, with 35 colors (gray levels),
and a pattern of size 41 x 41 was extracted from it.

Table 2 shows some timings for the search. The difference between the times
of the Exact model and the Accumulated model with k = 0 reveals the more
complex implementation of the Accumulated model. Our results show that the
algorithms can be implemented and that, although they are preliminary versions,
they run
reasonably fast. For comparison, our optimized $O(n^2 (k/\sigma)^{3/2})$ expected
time on-line algorithm spends 0.81 seconds for k = 0, 1.67 seconds for k = 8, 3.62
seconds for k = 192, and 3.85 seconds for k = 256. With pattern partitioning,
the algorithms would be much faster for large k.

Model          k      time
Exact          -      0.0055
Accumulated    0      0.0086
Accumulated    1      0.0088
Accumulated    2      0.0089
Accumulated    4      0.0090
Accumulated    8      0.0097
Accumulated    16     0.0110
Accumulated    32     0.0165
Accumulated    64     0.0567
Accumulated    96     0.1720
Accumulated    128    0.3930
Accumulated    192    2.0390
Accumulated    256    6.3910



Table 2. Experimental results for the Exact and Accumulated models. The times are 
given in seconds. 
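
To make the comparison with the on-line algorithm concrete, the following small sketch computes the ratio between the on-line times quoted in the text and the index times of Table 2 for the four values of k where both are reported. The pairing of each time with its k assumes that the table columns are read in order; the variable names are illustrative.

```python
# Times in seconds, as reported in Table 2 (index) and in the text (on-line).
index_time = {0: 0.0086, 8: 0.0097, 192: 2.0390, 256: 6.3910}
online_time = {0: 0.81, 8: 1.67, 192: 3.62, 256: 3.85}

def speedups(index, online):
    """Ratio on-line time / index time for every k present in both tables."""
    return {k: online[k] / index[k] for k in sorted(index) if k in online}

ratios = speedups(index_time, online_time)
for k, r in ratios.items():
    print(f"k = {k:3d}: on-line / index = {r:.2f}")
```

A ratio below 1 means the index without pattern partitioning is slower than the on-line algorithm for that k, which happens only for the largest k in these measurements.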



9 Conclusions and Future Work 

We have proposed a sistring tree index to search two dimensional patterns in 
two dimensional texts allowing rotations. We have considered different matching 
models and obtained average time bounds that are sublinear for most reasonable 
cases. 

It is possible to extend the model by removing the center-to-center
assumption. In this case the number of patterns grows as high as $O(m^7)$ and
therefore there are $O(\ell^{7/2})$ sistrings to search at depth $\ell$. The search
time for the Exact model becomes $O((\log_\sigma n)^{9/2})$. By indexing all the
rotations and center displacements we get $O(\log_\sigma n)$ time again, but at a
space cost of $O(n^2 (\log_\sigma n)^{7/2})$.

It is also possible to extend the methods to three dimensions |E|. With the 
center-to-center assumption we have rotations. This means 

sistrings at depth i. Therefore, at 0{n^) space the total search time becomes 
0((log^ for exact searching. If we index all the rotations up to = 

xlog^ n with X > 3 we will have a space requirement of 0(n^ (log^ and a 

search cost of O(log^n). For the Grays model we have 0((log^ 

(28/27)^ time. All the other techniques can be extended as well. 

A Probability of Matching under the Hamming Model



We need to determine the probability of the search being active at a
given node of depth $\ell$ in the sistring trie under the Hamming model. We are
therefore interested in the probability of a pattern prefix of length $\ell$
matching a text substring of length $\ell$. For this to hold, at least $\ell - k$
text characters must match the pattern. Hence, the probability of matching is
upper bounded by

$$\frac{1}{\sigma^{\ell-k}} \binom{\ell}{k}$$

where the binomial coefficient counts all the possible locations for the
matching characters.

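In the analysis that follows, we call $\beta = k/\ell$ and take it as a constant
(which is our case of interest, as seen later). We will prove that, after some
length $\ell$, the matching probability is $O(\gamma(\beta)^\ell)$, for some
$\gamma(\beta) < 1$. By using Stirling’s approximation
$x! = (x/e)^x \sqrt{2\pi x} \, (1 + O(1/x))$ over the matching probability we have

$$\frac{1}{\sigma^{\ell-k}} \left( \frac{\ell^\ell \sqrt{2\pi\ell}}
  {k^k (\ell-k)^{\ell-k} \sqrt{2\pi k} \, \sqrt{2\pi(\ell-k)}} \right)
  \left( 1 + O\left(\frac{1}{\ell}\right) \right)$$

which is

$$\left( \frac{1}{\sigma^{1-\beta} \beta^\beta (1-\beta)^{1-\beta}} \right)^{\ell}
  \ell^{-1/2} \left( \frac{1}{\sqrt{2\pi\beta(1-\beta)}}
  + O\left(\frac{1}{\ell}\right) \right)$$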

This formula is of the form $\gamma(\beta)^\ell \, O(1/\sqrt{\ell})$, where we define

$$\gamma(x) = \frac{1}{\sigma^{1-x} x^x (1-x)^{1-x}}$$



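Therefore the probability is exponentially decreasing with $\ell$ if and only if
$\gamma(\beta) < 1$, that is,

$$\sigma > \left( \frac{1}{\beta^\beta (1-\beta)^{1-\beta}} \right)^{\frac{1}{1-\beta}}
  = \frac{1}{\beta^{\frac{\beta}{1-\beta}} (1-\beta)}$$

It is easy to show analytically that $e^{-1} < \beta^{\frac{\beta}{1-\beta}} < 1$ if
$0 < \beta < 1$, so it suffices that $\sigma \ge e/(1-\beta)$, or equivalently,
$\beta \le 1 - e/\sigma$ is a sufficient condition for the probability to be
exponentially decreasing with $\ell$.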

Hence, the result is that the matching probability is very high for
$\beta = k/\ell \ge 1 - e/\sigma$, and that otherwise it is
$O(\gamma(\beta)^\ell / \sqrt{\ell})$, where $\gamma(\beta) < 1$.
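
The quantities of this appendix are easy to evaluate numerically. The sketch below, with illustrative function names, computes the upper bound on the matching probability, the base $\gamma(\beta)$, and the sufficient threshold $1 - e/\sigma$. It assumes $0 < \beta < 1$.

```python
import math

def hamming_bound(sigma, ell, k):
    """Upper bound C(ell, k) / sigma**(ell - k) on the matching probability."""
    return math.comb(ell, k) / sigma ** (ell - k)

def gamma_hamming(sigma, beta):
    """gamma(beta) = 1 / (sigma^(1-beta) * beta^beta * (1-beta)^(1-beta))."""
    return 1.0 / (sigma ** (1 - beta) * beta ** beta * (1 - beta) ** (1 - beta))

def beta_threshold(sigma):
    """Sufficient condition of the appendix: beta <= 1 - e / sigma."""
    return 1 - math.e / sigma
```

For example, with sigma = 35 (the number of gray levels used in the experiments) the threshold is slightly above 0.92, and gamma_hamming stays below 1 for every beta under it, so the bound shrinks as the depth grows with k/ell held fixed.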



B Probability of Matching under the Accumulated Model 

We need to determine the probability of the search being active at a
given node of depth $\ell$ in the sistring trie under the Accumulated model. We are
therefore interested in the probability of two random strings of length $\ell$
matching with at most k errors. Our model is as follows: we consider the sequence
of $\ell$ absolute differences between both strings, $\delta_1 \ldots \delta_\ell$. The
matching condition states that $\sum_{i=1}^{\ell} \delta_i \le k$.

The number of different sequences of differences satisfying this is
$\binom{k+\ell}{k}$, which can be seen as the number of ways to insert $\ell$
divisions into a sequence of k elements. The $\ell$ divisions divide the sequence
into $\ell + 1$ zones. The sizes of the first $\ell$ zones are the $\delta_i$ values
and the last allows the sum to be $\le k$ instead of exactly k. Note that we are
pessimistically forgetting about the fact that indeed $\delta_i < \sigma$.

Finally, each difference $\delta_i$ can be obtained in two ways: $P_i + \delta_i$ and
$P_i - \delta_i$ (we again pessimistically count twice the case $\delta_i = 0$).
Therefore, the total matching probability is upper bounded by

$$\frac{2^\ell}{\sigma^\ell} \binom{\ell+k}{k}$$

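In the analysis that follows, we call $\beta = k/\ell$ and take it as a constant
(which is our case of interest, as seen later). We will prove that, after some
length $\ell$, the matching probability is $O(\gamma(\beta)^\ell)$, for some
$\gamma(\beta) < 1$. By using Stirling’s approximation
$x! = (x/e)^x \sqrt{2\pi x} \, (1 + O(1/x))$ over the matching probability we have

$$\frac{2^\ell}{\sigma^\ell} \left( \frac{(k+\ell)^{k+\ell} \sqrt{2\pi(k+\ell)}}
  {k^k \ell^\ell \sqrt{2\pi k} \, \sqrt{2\pi\ell}} \right)
  \left( 1 + O\left(\frac{1}{\ell}\right) \right)$$

which is

$$\left( \frac{2 (1+\beta)^{1+\beta}}{\sigma \beta^\beta} \right)^{\ell}
  \ell^{-1/2} \left( \sqrt{\frac{1+\beta}{2\pi\beta}}
  + O\left(\frac{1}{\ell}\right) \right)$$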





This formula is of the form $\gamma(\beta)^\ell \, O(1/\sqrt{\ell})$, where we define

$$\gamma(x) = \frac{2 (1+x)^{1+x}}{\sigma x^x}$$



Therefore the probability is exponentially decreasing with $\ell$ if and only if
$\gamma(\beta) < 1$, that is,

$$\frac{2 (1+\beta)}{\sigma} \left( 1 + \frac{1}{\beta} \right)^{\beta} < 1$$

It can be easily seen analytically that $(1 + 1/\beta)^\beta < e$, so
$\beta \le \sigma/(2e) - 1$ is a sufficient condition for the probability to be
exponentially decreasing with $\ell$.

Hence, the result is that the matching probability is very high for
$\beta = k/\ell \ge \sigma/(2e) - 1$, and that otherwise it is
$O(\gamma(\beta)^\ell / \sqrt{\ell})$, where $\gamma(\beta) < 1$.
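
The counting argument can be checked by brute force on small cases. The sketch below, with illustrative function names, enumerates all sequences of $\ell$ non-negative differences whose sum is at most k, and also evaluates the bound and the base $\gamma(\beta)$ of this appendix. Like the analysis, it ignores the restriction $\delta_i < \sigma$.

```python
import math
from itertools import product

def count_sequences(ell, k):
    """Brute-force count of (d_1, ..., d_ell), d_i >= 0, with sum at most k."""
    return sum(1 for seq in product(range(k + 1), repeat=ell) if sum(seq) <= k)

def accumulated_bound(sigma, ell, k):
    """Upper bound 2^ell * C(ell + k, k) / sigma^ell on the matching probability."""
    return 2 ** ell * math.comb(ell + k, k) / sigma ** ell

def gamma_accumulated(sigma, beta):
    """gamma(beta) = 2 * (1 + beta)^(1 + beta) / (sigma * beta^beta), beta > 0."""
    return 2 * (1 + beta) ** (1 + beta) / (sigma * beta ** beta)
```

The enumeration is exponential in ell and is only meant for tiny inputs; it agrees with the closed form C(ell + k, k) on them.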

Abstract. An edge coloring is an assignment of colors to the edges of a
graph so that no two edges with a common vertex have the same color.
We show that, given an undirected tree T with n vertices, a minimum
edge coloring of T can be determined in $O(\sqrt{n})$ time on a
$\sqrt{n} \times \sqrt{n}$ mesh-connected computer (MCC) by a novel technique
which decomposes the tree into disjoint chains and then assigns the edge colors
in each chain properly. The time complexity is optimal on MCC within a constant
factor.



1 Introduction 



A mesh-connected computer (MCC) consists of n identical processing elements
(PE’s) arranged on a $\sqrt{n} \times \sqrt{n}$ array, where each PE is connected to
its four neighbors [1]. (See Fig. 1.) We assume that the MCC functions in SIMD
mode, where all the PE’s are synchronized under one control unit. MCCs have been
used as a model of parallel computation for problems in diverse areas including
sorting, graph theoretic problems, computational geometry, and image processing
[2,3,4,5,6,7]. In this paper, we consider the problem of edge coloring in graph
theory on MCC.

An edge coloring is an assignment of colors to the edges of a graph G = (V, E)
so that no two edges with a common vertex have the same color. A minimum
edge coloring is an edge coloring which uses a minimum number of colors. Many
researchers have worked on the design of sequential algorithms for finding a
minimum edge coloring of a bipartite graph [10,11,12,13,14]. The best known
sequential algorithm for edge coloring of a bipartite graph is due to Cole and
Hopcroft [10] and runs in $O(|E| \log |V|)$ time. For a minimum edge coloring of a
tree T with n vertices, Mitchell and Hedetniemi presented an $O(n)$ sequential
algorithm [13].

Using the idea presented in cuni, Lev, Pippenger and Valiant implemen- 
ted an 0{log^n) parallel algorithm for a minimum edge coloring of a bipartite 
graph on ERE W (Exclusive Read and Exclusive Write) computation model m 
However, their algorithm cannot be directly implemented for a tree on MCC in
optimal time. As far as we know, no papers have proposed an optimal parallel
algorithm for the minimum edge coloring of a tree so far. In this paper, we
consider the problem of minimum edge coloring of a tree with n vertices and
present an $O(\sqrt{n})$ parallel algorithm for solving the problem on a
$\sqrt{n} \times \sqrt{n}$ MCC, which is optimal within a constant time factor. We
shall show that this can be achieved by a novel technique which decomposes the
tree into disjoint chains and then assigns the edge colors in each chain properly.

Fig. 1. 4 x 4 mesh-connected computer

In section 2, we describe some notations and definitions, and in section 3
the basic operations used in the design of the parallel algorithm. In section 4,
we give an $O(\sqrt{n})$ parallel edge coloring algorithm for a tree T(V,E),
|V| = n, on a $\sqrt{n} \times \sqrt{n}$ MCC, and in section 5 we give a conclusion.

2 Notations and Definitions 

Let T(V,E) be a rooted directed tree, where there is a unique path from the
root to each vertex in V. The vertices in V are labeled from 1 to |V|. If (i,j)
is in edge set E, then i is the parent of j, denoted by p(j), and j is a child of
i. Note that for each vertex i, its parent p(i) is unique, whereas it may have
more than one child. The depth of a vertex v in T is the length (i.e., number
of edges) of the path from the root to v. The degree of a vertex v in T is the
number of the edges incident upon v in T. Let d be the maximum degree of a
vertex in T. The minimum number of colors required in an edge coloring of a 
graph G is called the edge chromatic number of G. It is well known that the edge 
chromatic number of a bipartite graph G is equal to the maximum degree of G 
PJ. Since a tree is a bipartite graph, the edge chromatic number of T becomes d. 
Therefore, the minimum edge coloring of T using colors 1 through d partitions
E into $E_1, E_2, \ldots, E_d$ such that all the edges in $E_i$ are assigned color i
and no two edges in $E_i$ are adjacent to each other. (See Fig. 2.)

The order number s(i), for each edge (p(i),i) in T, is defined as follows: We
order each set S of edges with the same parent by sorting the edges (p(i),i) in S
in ascending order of i. For each edge (p(i),i), we define s(i) to be the number of
the edges in S previous to it including itself. (See Fig. 3-a.) Suppose each edge
in T is colored with its order number. A path in T is called a color chain if it is
a maximal path consisting solely of edges with the same color. (See Fig. 3-b.)
A color chain consisting of edges with color k is called a k-chain.

Fig. 2. Edge coloring of a tree with maximum degree 4

Fig. 3. Order numbers and color chains


A linear chain is a directed path which consists of edges each of the form
(i, succ(i)), where vertex succ(i) is the immediate successor of vertex i. (See
Fig. 4.) Suppose that each edge is associated with its weight. Then, the rank of
an edge in the linear chain is the sum of the weights of its preceding edges and
itself. Note that each color chain is a linear chain if i, for each edge (p(i),i),
is regarded as the immediate successor of p(i). The root edge in the color chain
(or linear chain) is the one with no predecessors.

Fig. 4. Rank in a linear chain: a) Linear chain, b) Ranks



3 Basic Operations 



The MCC consists of n PE’s indexed from 0 to n - 1, each having a local memory
with a constant number of registers. We shall use the following basic operations,
which take $O(\sqrt{n})$ time on a $\sqrt{n} \times \sqrt{n}$ MCC |2ldl5l8| . We
assume that input data of size n are distributed on a $\sqrt{n} \times \sqrt{n}$ MCC,
one element per PE.



(1) Sorting: The elements are rearranged in nondecreasing or nonincreasing
order of a specified key.

(2) Selected Broadcasting: Suppose a set S of n elements is partitioned into
subsets $\{S_1, S_2, \ldots, S_m\}$. By the selected broadcasting operation, the
first element of each $S_i$ is distributed to all the other PE’s containing the
elements in $S_i$.

(3) RAR (Random Access Read): In a RAR, each PE specifies the key of the data it
wishes to receive, or it specifies a null key, in which case it receives nothing.
Each PE reads a data item from the PE which contains its specified key. Several
PE’s can request the same key and read the same data.

(4) RAW (Random Access Write): In a RAW, each PE specifies the key of data,
and sends its data to the PE which contains its specified key. If two or more
PE’s send data with the same specified key, then the PE with that key will
receive the minimum of the data or another value obtained by applying
some commutative, associative binary operation to all the data sent to it.

(5) Chain-Rank: A linear chain consists of directed edges each of which is
associated with its weight. The rank of each edge is defined as the sum of
the weights of its predecessors in the chain and of the edge itself. The first
edge in the linear chain is called a root edge. Given a collection of linear
chains with n edges, the chain-rank operation finds, for each edge, its rank in
the linear chain containing it. Figure 4 shows an example of a linear chain and
the ranks of its edges, each with weight 1.
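
What the chain-rank operation computes can be sketched sequentially as follows. This is not the $O(\sqrt{n})$ mesh implementation, only a plain illustration of the result it produces. The dictionaries pred and weight and the function name are assumptions for this example, and the rank is taken to include the edge's own weight, so that a root edge of weight 1 has rank 1 as in Section 2.

```python
def chain_rank(pred, weight):
    """Rank of every edge in a collection of linear chains.

    pred[e] is the predecessor edge of e in its chain, or None for a root edge.
    weight[e] is the weight of e. The rank of e is the sum of the weights of
    its predecessors plus its own weight.
    """
    rank = {}
    for e in pred:
        # Walk towards the root until an edge with a known rank is found.
        path = []
        cur = e
        while cur is not None and cur not in rank:
            path.append(cur)
            cur = pred[cur]
        acc = rank[cur] if cur is not None else 0
        # Unwind the path, accumulating weights from the root side.
        for x in reversed(path):
            acc += weight[x]
            rank[x] = acc
    return rank
```

Each edge is assigned a rank exactly once, so the sketch runs in time linear in the number of edges.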







C.-S. Jeong et al. 



4 Parallel Algorithm 

In this section we shall describe a parallel algorithm for finding a minimum edge
coloring for a tree T. The basic idea of our algorithm can be described briefly as
follows: First, color every edge of T by its order number. Note that no more than
two edges adjacent to a vertex in T have the same color, and color d has not
been used at all, since the maximum value of order numbers is d - 1. Using those
facts, we can obtain a minimum edge coloring by decomposing T into disjoint
color chains and then changing the colors of every other edge in each color chain
into d. In the following we shall describe the detailed implementation of the
parallel algorithm.

Parallel Algorithm Edge-Coloring(EC) 

Input: Each PE with index i on a $\sqrt{n} \times \sqrt{n}$ MCC contains an edge
(p(i), i) and a vertex i of a directed tree T(V, E).

Output: A minimum edge coloring C such that C(p(i),i) is m if the edge
(p(i), i) is assigned color m.

1) Find the maximum degree d of T. This can be done as follows: For each PE
containing an edge (p(i),i), send a 1 to the PE containing the vertex p(i)
by RAW. During RAW, several 1’s may have the same destination PE, and
the PE containing a vertex j can receive the value obtained by summing
all the 1’s sent to that PE. Therefore, each PE containing a vertex j stores
the number of its children after executing RAW, and hence we can compute
the degree of j by adding one to it for the edge to its parent (the root has
no parent edge, so its degree is its number of children). Then, we can easily
find the maximum degree d by computing the largest value of the degrees of
all the vertices.

2) Compute, for each edge (p(i),i), its order number s(i).

Implementation: First, sort the edges (p(i),i) lexicographically according to
(p(i), i) in nondecreasing order. During the sort, the smallest edge e in each
set S of edges with the same parent is located, and the index l of the PE
containing e is sent to all the PE’s storing the other edges in S by selected
broadcasting. Then, s(i) can be obtained by subtracting (l - 1) from the index
of the PE containing (p(i),i).

3) Assign, for each edge (p(i),i), color s(i) to C(p(i),i). (See Fig. 3-a.) The
colors assigned at this step range from 1 to d - 1, since the maximum value
of s(i) is d - 1.

4) Decompose T into disjoint k-chains, $1 \le k \le d - 1$. (See Fig. 3-b.)

Implementation: Step 4) can be done by separating the root edge in each color
chain from the other color chains. This can be implemented as follows: Find,
for each edge (p(i),i), its parent edge (j,p(i)) by RAR, where j is the parent
of p(i). Then, check whether the color of (p(i),i) is different from that of
(j,p(i)) or not. If it is, separate (p(i),i) from the other color chains adjacent
to it by replacing (p(i),i) with $(p(i)_{s(i)}, i)$, since (p(i),i) becomes a root
edge of the color chain containing it; otherwise (p(i),i) cannot be a root edge
of the color chain to which it belongs, and hence we don’t have to separate it
from the other color chains.

Parallel Edge Coloring of a Tree on a Mesh Connected Computer

Fig. 5. Example of the parallel algorithm EC. a) Ranks in color chain: bold lines
represent edges with even ranks

5) Find, for each edge (p(i),i), its rank in the color chain containing it after
assigning a weight 1 to every edge. (See Fig. 5-a.)

Comment: We can easily see that at most two edges with a common vertex
may have the same color after executing step 3). In order to fix the conflicting
color assignments, the colors of all the edges with even ranks are replaced
with color d in the following step.

6) Change, for each edge (p(i),i) with an even rank, its color C(p(i),i) into d.
(See Fig. 5-b.)

End Algorithm. 
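
The coloring logic of steps 1) to 6) can be sketched sequentially as follows. This is only an illustration of what the algorithm computes, not of the mesh implementation: sorting, RAW, RAR and chain-rank are replaced by ordinary dictionary operations and a breadth-first traversal. The input format (a dictionary parent mapping every non-root vertex i to p(i)) and all function names are assumptions for this example.

```python
from collections import defaultdict, deque

def edge_coloring_tree(parent):
    """Sequential simulation of algorithm EC.

    parent maps every non-root vertex i to p(i); edge (p(i), i) is keyed by i.
    Returns (color, d), where color[i] is the final color of edge (p(i), i).
    """
    children = defaultdict(list)
    degree = defaultdict(int)
    for i, p in parent.items():
        children[p].append(i)
        degree[i] += 1
        degree[p] += 1
    d = max(degree.values())                      # step 1: maximum degree

    order = {}                                    # steps 2 and 3: order numbers
    for p, kids in children.items():
        for s, i in enumerate(sorted(kids), start=1):
            order[i] = s

    root = next(p for p in children if p not in parent)
    rank = {}                                     # steps 4 and 5: rank in color chain
    queue = deque(children[root])
    while queue:
        i = queue.popleft()
        p = parent[i]
        # (p(i), i) continues the chain of its parent edge only if both
        # edges received the same order number; otherwise it is a root edge.
        same_chain = p in parent and order[p] == order[i]
        rank[i] = rank[p] + 1 if same_chain else 1
        queue.extend(children[i])

    # step 6: edges with even rank are recolored with d
    color = {i: (d if rank[i] % 2 == 0 else order[i]) for i in parent}
    return color, d

def is_proper(parent, color):
    """True if no two edges sharing a vertex have the same color."""
    seen = defaultdict(set)
    for i, p in parent.items():
        for v in (i, p):
            if color[i] in seen[v]:
                return False
            seen[v].add(color[i])
    return True
```

For instance, on the tree with edges (1,2), (1,3), (2,4), (2,5), (4,6), (4,7), (6,8) the 1-chain 1-2-4-6-8 has ranks 1, 2, 3, 4, so the edges (2,4) and (6,8) are recolored with d = 3.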

Lemma 1: The parallel algorithm EC can be executed in $O(\sqrt{n})$ time on a
$\sqrt{n} \times \sqrt{n}$ MCC.

Proof: Steps 1) and 2) take $O(\sqrt{n})$ time for RAW, sorting, and selected
broadcasting. Step 3) can be done in $O(1)$ time and step 4) in $O(\sqrt{n})$ time
for RAR. Step 5) can be carried out in $O(\sqrt{n})$ time by the chain-rank
operation. Step 6) takes $O(1)$ time. Therefore, the overall time complexity is
$O(\sqrt{n})$.

Lemma 2: The edge coloring C in the parallel algorithm EC is a minimum edge 
coloring for T. 

Proof: Clearly, the colors used in the parallel algorithm EC range from 1 to
d. Since d is the edge chromatic number of T, all we have to prove is that no
two edges with a common vertex have the same color. Consider a vertex i with
parent p(i) and m children $j_1, j_2, \ldots, j_m$ ordered from left to right so
that the order number of $(i, j_k)$ is k, $1 \le k \le m$. (See Fig. 6-a.) Let A be
the set of edges $(i, j_k)$, $1 \le k \le m$. After step 3) each edge $(i, j_k)$ is
assigned color k in the range from 1 to d - 1. Suppose that edge (p(i),i) is
colored $\ell$. Then, we can consider the following two cases:

Case 1): $\ell > m$: Since all the edges in A are assigned colors all different
from $\ell$, each edge in A becomes a root edge in the color chain containing it and
has its rank value 1. Therefore, the color of each edge in A is not changed at
step 6), and hence no two edges adjacent to i have the same color.

Case 2): $\ell \le m$: Let $(i, j_\ell)$ denote the edge with the same color $\ell$ as
(p(i),i). (See Fig. 6-b.) Since both (p(i),i) and $(i, j_\ell)$ belong to the same
$\ell$-chain, one of them changes its color to d at step 6). All the edges in A
other than $(i, j_\ell)$ have colors different from $\ell$, and their colors are not
changed due to the fact that each of them is a root edge in the color chain
containing it. Therefore, no two edges adjacent to i have the same color.

Fig. 6. Illustration of Lemma 2 (panels a and b)

An undirected tree with n vertices can be converted into a directed one in
$O(\sqrt{n})$ time on a $\sqrt{n} \times \sqrt{n}$ MCC [5]. Therefore, the following
theorem follows directly from Lemmas 1 and 2.

Theorem 1: Given an undirected tree T with n vertices, its minimum edge
coloring can be computed in $O(\sqrt{n})$ time on a $\sqrt{n} \times \sqrt{n}$ MCC.

5 Conclusion 

In this paper, we have presented a parallel algorithm for finding a minimum
edge coloring of a tree with n vertices in $O(\sqrt{n})$ time on a
$\sqrt{n} \times \sqrt{n}$ MCC. The time complexity is optimal on MCC within a
constant factor. This was achieved by a technique which decomposes the tree into
disjoint color chains and then changes the colors of each color chain properly.
It remains open if one can find an $O(\sqrt{n})$ algorithm for finding a maximum
matching of a tree with n vertices on a $\sqrt{n} \times \sqrt{n}$ MCC. Also open are
the problems of finding a minimum edge coloring and a maximal matching for a
general bipartite graph on a $\sqrt{n} \times \sqrt{n}$ MCC in $O(\sqrt{n})$ time. We
expect that the parallel algorithm developed in this paper may provide a base
for finding optimal parallel algorithms for those open problems.

Abstract. The problem of computing a matching of maximum weight
in a given edge-weighted graph is not known to be P-hard or in RNC.
This paper presents four parallel approximation algorithms for this
problem. The first is an RNC approximation scheme, i.e., an RNC algorithm
that computes a matching of weight at least $1 - \varepsilon$ times the maximum
for any given constant $\varepsilon > 0$. The second one is an NC approximation
algorithm achieving an approximation ratio of $\frac{1}{2+\varepsilon}$ for any
fixed $\varepsilon > 0$. The third and fourth algorithms only need to know the total
order of weights, so they are useful when the edge weights require a large amount
of memory to represent. The third one is an NC approximation algorithm
that finds a matching of weight at least $\frac{2}{3\Delta+2}$ times the maximum,
where $\Delta$ is the maximum degree of the graph. The fourth one is an RNC
algorithm that finds a matching of weight at least $\frac{1}{2\Delta+4}$ times the
maximum on average, and runs in $O(\log \Delta)$ time, not depending on the size
of the graph.

Keywords: Graph algorithm, maximum weighted matching, approxi- 
mation algorithm, parallel algorithm. 



1 Introduction 

Throughout this paper, a graph means an edge-weighted graph, unless explicitly 
specified otherwise. A matching in a graph G is a set M of edges in G such that no 
two edges in M share an endpoint. The weight of a matching M is the total weight 
of the edges in M . The maximum weighted matching (MWM) problem is to find 
a matching of maximum weight of a given graph. The maximum cardinality 
matching (MCM) problem is the special case of the MWM problem where all 
edges of the input graph have the same weight. 

For the MWM problem, Edmonds’ algorithm [Edm65] has stood as one of the
paradigms in the search of polynomial-time algorithms for integer programming
problems (see also [Gal86]). Some sophisticated implementations of his algorithm
have improved its time complexity: for example, Gabow [Gab90] has given an
algorithm using $O(n(m + n \log n))$ time and $O(m)$ space.

J. van Leeuwen et al. (Eds.): IFIP TCS2000, LNCS 1872, pp. 84-^^ 2000. 

The parallel complexity of the MWM problem is still open; Galil asks if the
MWM problem is P-complete [Gal86]. The best known lower bound is the CC-
hardness result pointed out in |GHP95j (see for the definition of the class 

CC). On the other hand, the best known upper bound is the RNC algorithm of
Karp, Upfal, and Wigderson [KUW86] for the special case where the weights on
the edges of the input graph G are bounded from above by a fixed polynomial
in the number of vertices in G. We call this case the polynomial-weight case of
the MWM problem.

From the practical point of view, Edmonds’ algorithm is too slow. Faster
algorithms are required, and heuristic algorithms and approximation algorithms
have been widely investigated. A survey of heuristic algorithms can be found
in [Avi83], and some approximation algorithms can be found in [Ven87]. In
particular, a $\frac{1}{2}$-approximation algorithm can be obtained by a greedy
strategy that picks up the heaviest edge e and deletes e and its incident edges,
and
repeats this until the given graph becomes empty. Recently, Preis [Pre99] gave
a linear-time implementation of this greedy algorithm.
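
The greedy strategy described above can be sketched as follows. This is a straightforward sort-based formulation that takes $O(m \log m)$ time, not the linear-time implementation mentioned in the text; the edge representation as (u, v, weight) triples and the function name are assumptions for this example.

```python
def greedy_matching(edges):
    """Greedy matching: repeatedly take the heaviest edge whose endpoints are free.

    edges is an iterable of (u, v, weight) triples.
    Returns the list of chosen edges, heaviest first.
    """
    matched = set()
    chosen = []
    for u, v, w in sorted(edges, key=lambda e: e[2], reverse=True):
        # Skipping an edge with a matched endpoint has the same effect as
        # deleting the edges incident to a previously chosen edge.
        if u not in matched and v not in matched:
            chosen.append((u, v, w))
            matched.update((u, v))
    return chosen
```

On the path a-b-c-d with weights 3, 4, 3 the greedy strategy picks only the middle edge (weight 4), while the optimum takes the two outer edges (weight 6), which illustrates that the ratio 1/2 cannot be improved much for this strategy.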

For the MCM problem, several NC approximation algorithms are known (see
[KR98] for a comprehensive reference). In particular, Fischer et al. [FGHP93]
showed an NC algorithm that computes a matching with cardinality at least
$1 - \varepsilon$ times the maximum for any fixed constant $\varepsilon > 0$. This NC
approximation algorithm is based on essential properties of the MCM problem that
do not belong to the MWM problem.

Our first result is an RNC approximation scheme for the MWM problem, i.e.,
an RNC algorithm which computes a matching of weight at least $1 - \varepsilon$ times
the maximum for any fixed constant $\varepsilon > 0$. This scheme uses the RNC
algorithm for the polynomial-weight case of the MWM problem due to Karp et al.
as a subroutine.

Our second result is an NC approximation algorithm for the MWM problem.
This algorithm can be viewed as a parallelized version of the greedy strategy.
For any fixed constant $\varepsilon > 0$, this algorithm computes a matching of weight
at least $\frac{1}{2+\varepsilon}$ times the maximum. The work done by the algorithm
is optimal up to a polylogarithmic factor.

In the results above, the size of the problem depends not only on the number
of edges and vertices, but also on the space needed to represent the edge weights.
Thus the algorithms are not useful when the edge weights require a large amount
of memory to represent. We next consider algorithms that only use the total
order of the weights and do not need to store each weight itself.

Our third result is an NC approximation algorithm for the MWM problem.
This algorithm computes a matching of weight at least $\frac{2}{3\Delta+2}$ times
the maximum weight, where $\Delta$ is the maximum degree of the given graph. It runs
in $O(\log n)$ time using n processors.

Our fourth result is an RNC approximation algorithm for the MWM problem.
This algorithm computes a matching whose expected weight is at least
$\frac{1}{2\Delta+4}$ times the maximum weight. It runs in $O(\log \Delta)$ time using
n processors. Note that the time complexity does not depend on the size of the
graph; it can be performed efficiently in parallel even on a large scale
distributed system.

The rest of the paper is organized as follows. In section 2 we give basic
definitions of the problem. The first and second algorithms are discussed in
section 3. The third and fourth algorithms for graphs with heavy weights are
stated in section 4.

2 Preliminaries 

We will deal only with graphs without loops or multiple edges. Let G = (V, E)
be a graph. For each edge e of G, $w_G(e)$ denotes the weight of e in G. We
assume that each weight is positive and different from all other weights. For
a subset F of edges of G, $w_G(F)$ denotes the total weight of the edges in F,
i.e., $w_G(F) = \sum_{e \in F} w_G(e)$. A maximal matching of G is a matching of G
that is not properly contained in another matching. A maximum weighted matching
of G is a matching whose weight is maximized over all matchings of G. Note that
a maximum weighted matching must be maximal. Let M be a matching and
$\alpha = (e_1, e_2, \ldots, e_l)$ be a path of length l in a graph. We call $\alpha$ an
alternating path admitted by M if the edges on $\alpha$ are alternately in M, that
is, either $\{e_1, e_3, \ldots\} \subseteq M$ or $\{e_2, e_4, \ldots\} \subseteq M$. We
sometimes unify the alternating path and the set of edges in the matching.

The neighborhood of a vertex v in G, denoted by $N_G(v)$, is the set of vertices
in G adjacent to v. The degree of a vertex v in G is $|N_G(v)|$, and is denoted by
$\deg_G(v)$. The maximum degree of a graph G is $\max_{v \in V} \deg_G(v)$, and is
denoted by $\Delta_G$. Without loss of generality, we assume that $\Delta_G \ge 2$.
The notations $w_G(e)$, $\deg_G(v)$, $N_G(v)$, and $\Delta_G$ are sometimes written
as just w(e), deg(v), N(v), and $\Delta$ if G is understood. For $F \subseteq E$, G[F]
denotes the graph (V, F).

The model of parallel computation used here is the PRIORITY PRAM, where 
the processors operate synchronously and share a common memory. Both 
simultaneous reading and simultaneous writing to the same cell are allowed; in case 
of simultaneous writing, the processor with the lowest index succeeds. An NC 
algorithm (respectively, RNC algorithm) is one that can be implemented to run in 
polylogarithmic time (respectively, expected polylogarithmic time) using a 
polynomial number of processors on a PRIORITY PRAM. More details about the 
model, NC algorithms and RNC algorithms can be found in l,lohil()IRPh()l . 

An algorithm $A$ for the MWM problem achieves a ratio of $\rho$ if for every graph 
$G$, the matching $M$ found by $A$ on input $G$ has weight at least $\rho \cdot w_G(M^*)$, 
where $M^*$ is a maximum weighted matching of $G$. An RNC approximation scheme for 
the MWM problem is a family $\{A_i : i \ge 2\}$ of algorithms such that for any 
fixed integer $i \ge 2$, $A_i$ is an RNC algorithm and achieves a ratio of $1 - \frac{1}{i}$. 

3 Approximation Algorithms 

3.1 Common Preprocess 

Throughout the rest of this paper, let $G$ be the input graph, $M^*$ be a maximum 
weighted matching of $G$, and $n$ and $m$ be the number of vertices and edges in 
$G$, respectively. Let $w_{\max}$ be the maximum weight of an edge in $G$. Let $k$ be a 
fixed positive integer and $\sigma$ be a fixed positive real with $1 > \sigma > 0$; the actual 
values of $k$ and $\sigma$ will become clear later. 

For each edge $e$ of $G$, we define an integer $r(e)$ as follows: 

— If $\frac{kn}{w_{\max}} \cdot w_G(e) \le 1$, then let $r(e) = 0$. 

— Otherwise, let $r(e)$ be the smallest integer $i$ such that 
$\frac{kn}{w_{\max}} \cdot w_G(e) \le (1+\sigma)^i$. 

We call $r(e)$ the rank of $e$. Note that $r(e) \in \{0, 1, \ldots, \lceil \log_{1+\sigma} kn \rceil\}$. 

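Let $E_i$ be the set of edges of $G$ with rank $i$. Let $M^*_i = M^* \cap E_i$. The following 
lemma shows that the total weight of the edges in $M^*_0$ is small. 

Lemma 1. $\sum_{e \in M^*_0} w_G(e) \le \frac{1}{2k} w_G(M^*)$. 

Proof. For each edge $e \in E_0$, $w_G(e) \le \frac{w_{\max}}{kn}$. Since a matching has at 
most $n/2$ edges and $w_{\max} \le w_G(M^*)$, we obtain 
$\sum_{e \in M^*_0} w_G(e) \le \frac{n}{2} \cdot \frac{w_{\max}}{kn} = \frac{w_{\max}}{2k} \le \frac{1}{2k} w_G(M^*)$. □ 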

Let $G'$ be the graph obtained from $G$ by deleting all edges of rank 0 and 
modifying the weight of each remaining edge $e$ to be $(1+\sigma)^{r(e)}$. 
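
The preprocessing above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `ranks_and_scaled_weights` and the dictionary-based graph representation are hypothetical, and the rank rule is the one read off from the definition above (rank 0 when $\frac{kn}{w_{\max}} w_G(e) \le 1$, otherwise the smallest $i$ with $\frac{kn}{w_{\max}} w_G(e) \le (1+\sigma)^i$).

```python
import math

def ranks_and_scaled_weights(weights, n, k, sigma):
    """weights: dict edge -> positive weight; n: number of vertices.

    Returns (rank, scaled): rank[e] is r(e); scaled[e] is the weight of e
    in G' and is defined only for edges of rank >= 1 (hypothetical helper).
    """
    w_max = max(weights.values())
    rank, scaled = {}, {}
    for e, w in weights.items():
        x = k * n * w / w_max
        if x <= 1:
            rank[e] = 0                      # rank-0 edges are deleted in G'
            continue
        i = max(1, math.ceil(math.log(x, 1 + sigma)))
        # Guard against floating-point error in the logarithm.
        while (1 + sigma) ** i < x:
            i += 1
        while i > 1 and (1 + sigma) ** (i - 1) >= x:
            i -= 1
        rank[e] = i
        scaled[e] = (1 + sigma) ** i         # weight of e in G'
    return rank, scaled
```

Because the largest scaled value is at most $(1+\sigma) \cdot kn$, the weights of $G'$ stay polynomial in $n$ for fixed $k$ and $\sigma$.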

3.2 RNC approximation scheme 

In this section, we present the RNC approximation scheme for the MWM 
problem. It is based on the following result due to Karp et al. 

Lemma 2. [KUW86] There is an RNC algorithm for the polynomial-weight 
case of the MWM problem. 



Theorem 1. There exists an RNC approximation scheme for the MWM 
problem. 

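Proof. Let $\ell$ be an integer larger than 1. Suppose we want to compute a matching 
$M$ of $G$ with $w_G(M) \ge (1 - \frac{1}{\ell}) \cdot w_G(M^*)$. Then, we fix $\sigma$ and $k$ as 
suitable functions of $\ell$ and construct $G'$ as in Section 3.1. Note that the weight of 
each edge of $G'$ is polynomial in $n$. So, we use the RNC algorithm in Lemma 2 to 
compute a maximum weighted matching $M'$ of $G'$. Note that $M'$ is also a matching of $G$. 
It remains to show that $w_G(M') \ge (1 - \frac{1}{\ell}) w_G(M^*)$. 

For each $i \ge 1$, let $M'_i = M' \cap E_i$. Since $M'$ is a maximum weighted 
matching of $G'$ and $\bigcup_{i \ge 1} M^*_i$ is a matching of $G'$, we have 

$$\sum_{i \ge 1} \sum_{e \in M^*_i} w_{G'}(e) \le w_{G'}(M') = \sum_{i \ge 1} |M'_i| (1+\sigma)^i.$$

On the other hand, we have 

$$\sum_{i \ge 1} \sum_{e \in M'_i} w_G(e) > \frac{w_{\max}}{kn} \sum_{i \ge 1} |M'_i| (1+\sigma)^{i-1},$$

since $\frac{w_{\max}}{kn} (1+\sigma)^{i-1} < w_G(e)$ if $e \in M'_i$ with $i \ge 1$. 

Thus, using $w_{G'}(e) \ge \frac{kn}{w_{\max}} w_G(e)$ for every edge $e$ of $G'$, 

$$w_G(M') = \sum_{e \in M'} w_G(e) \ge \frac{w_{\max}}{kn} \sum_{i \ge 1} \frac{|M'_i| (1+\sigma)^i}{1+\sigma} \ge \frac{w_{\max}}{kn(1+\sigma)} \sum_{i \ge 1} \sum_{e \in M^*_i} w_{G'}(e) \ge \frac{1}{1+\sigma} \sum_{i \ge 1} \sum_{e \in M^*_i} w_G(e) = \frac{w_G(M^* - M^*_0)}{1+\sigma}.$$

Now, by Lemma 1, $w_G(M') \ge \frac{1}{1+\sigma} (1 - \frac{1}{2k}) w_G(M^*)$. For the chosen 
values of $\sigma$ and $k$, $\frac{1}{1+\sigma} (1 - \frac{1}{2k}) \ge 1 - \frac{1}{\ell}$. This completes the proof. □ 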
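
As a sanity check on the final inequality of this proof, the following worked derivation uses the hypothetical choice $\sigma = \frac{1}{2\ell}$ and $k = \ell$. These values are an assumption made here for illustration and are not necessarily the ones chosen by the authors; the derivation also assumes that Lemma 1 is used in the form $w_G(M^*_0) \le \frac{1}{2k} w_G(M^*)$.

```latex
\[
\frac{1}{1+\sigma}\Bigl(1-\frac{1}{2k}\Bigr)
\;\ge\; (1-\sigma)\Bigl(1-\frac{1}{2k}\Bigr)
\;\ge\; 1-\sigma-\frac{1}{2k}
\;=\; 1-\frac{1}{2\ell}-\frac{1}{2\ell}
\;=\; 1-\frac{1}{\ell}.
\]
```

The first step uses $(1-\sigma)(1+\sigma) \le 1$, and the second drops the nonnegative term $\frac{\sigma}{2k}$.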



3.3 NC algorithm 

In this section, we parallelize the greedy algorithm to obtain an NC 
approximation algorithm for the MWM problem that achieves a ratio of $\frac{1}{2+\varepsilon}$ 
for any fixed $\varepsilon > 0$. 

The NC approximation algorithm will work on the graph $G'$ defined in 
Section 3.1. Recall that each edge $e$ of $G'$ inherits a rank $r(e)$ from $G$. Let 
$\ell = \lceil \log_{1+\sigma} kn \rceil$. The highest rank is $\ell$. The algorithm works as follows: 

NC Algorithm 

1. For each $i = \ell, \ell - 1, \ldots, 1$, perform the following steps: 

1.1. Find a maximal matching $M_i$ in $G'_i = (V, F_i)$, where $F_i$ is the set 
of edges of $G'$ with rank $i$. 

1.2. Remove all edges $e$ from $G'$ such that $e \in M_i$ or $e$ is incident to an 
endpoint of an edge in $M_i$. 

2. Output $\bigcup_{1 \le i \le \ell} M_i$. 

Let $M = \bigcup_{1 \le i \le \ell} M_i$. By Steps 1.1 and 1.2, $M$ is clearly a maximal 
matching of $G'$. 
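
The control flow of the algorithm can be illustrated sequentially. The sketch below is not a parallel implementation: a sequential greedy pass stands in for the parallel maximal matching subroutine of Step 1.1, and the name `rank_greedy_matching` and the input format (a dictionary from edges of $G'$ to their ranks) are assumptions made for this example.

```python
def rank_greedy_matching(edge_rank):
    """edge_rank: dict mapping edge (u, v) -> rank >= 1. Returns a matching."""
    matched = set()       # vertices already covered by the output matching
    matching = []
    for i in sorted(set(edge_rank.values()), reverse=True):
        # Step 1.1: a maximal matching among the surviving rank-i edges
        # (sequential greedy stands in for the parallel subroutine).
        for (u, v), r in edge_rank.items():
            if r == i and u not in matched and v not in matched:
                matching.append((u, v))
                # Step 1.2 is implicit: any edge touching u or v is
                # skipped from now on because its endpoint is matched.
                matched.update((u, v))
    return matching
```

Processing ranks from highest to lowest is what lets the analysis charge every edge of an optimal matching to an output edge of comparable or higher rank.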

Lemma 3. $w_G(M) \ge \frac{1}{2(1+\sigma)} w_G(M^* - M^*_0)$. 

Proof. The idea is to distribute the weights of all edges of $M^* - M^*_0$ to the edges 
of $M$. Let $e$ be an edge in $M^* - M^*_0$. Let $i = r(e)$. We distribute the weight 
$w_G(e)$ of $e$ as follows: 

— Case (1): $e \in M_i$. Then, we distribute $w_G(e)$ to $e$ itself. 

— Case (2): $e \notin M_i$ but one or both endpoints of $e$ are incident to an edge of 
$M_i$. Then, we distribute $w_G(e)$ to an arbitrary edge $e' \in M_i$ such that $e$ and 
$e'$ share an endpoint. Note that $w_G(e) / w_G(e') < 1 + \sigma$. 

— Case (3): Neither of cases (1) and (2) occurs. Then, by the algorithm, $e$ must 
share an endpoint with at least one edge $e' \in M_j$ such that $j > i$. We 
distribute $w_G(e)$ to an arbitrary such edge $e'$. Note that 
$w_G(e) \le \frac{w_{\max}}{kn} (1+\sigma)^i \le \frac{w_{\max}}{kn} (1+\sigma)^{j-1} < w_G(e')$. 



Parallel Approximation Algorithms 






Consider an edge $e' \in M$. Since $M^* - M^*_0$ is a matching, at most two edges 
$e \in M^* - M^*_0$ can distribute their weights to $e'$. Moreover, by Cases (1) through 
(3), the total weight distributed to $e'$ is at most $2(1+\sigma) w_G(e')$. Thus, 
$\sum_{e' \in M} 2(1+\sigma) w_G(e') \ge w_G(M^* - M^*_0)$. Consequently, 
$w_G(M) \ge \frac{1}{2(1+\sigma)} w_G(M^* - M^*_0)$. □ 

Thus we have the following theorem. 

Theorem 2. There is an NC approximation algorithm for the MWM problem 
that achieves a ratio of $\frac{1}{2+\varepsilon}$ for any fixed $\varepsilon > 0$. It runs in 
$O(\log^4 n)$ time using $n + m$ processors on the PRIORITY PRAM. 

Proof. Fix a positive real number $\varepsilon$. Suppose we want to compute a matching 
$M$ of $G$ with $w_G(M) \ge \frac{1}{2+\varepsilon} \cdot w_G(M^*)$. Then, we fix $\sigma$ and $k$ as 
suitable functions of $\varepsilon$, construct $G'$ as in Section 3.1, and run the above NC 
algorithm on input $G'$ to obtain a matching $M$. By Lemmas 1 and 3, 
$w_G(M) \ge \frac{1}{2(1+\sigma)} (1 - \frac{1}{2k}) w_G(M^*) \ge \frac{1}{2+\varepsilon} w_G(M^*)$. 

We next analyze the complexity needed to compute $M$. $G'$ can be constructed 
from $G$ in $O(1)$ time using $n + m$ processors. $M$ can be computed from $G'$ in 
$O(\log_{1+\sigma}(kn) \cdot (\log n + T(n,m)))$ time using $\max\{n + m, P(n,m)\}$ processors 
on the PRIORITY PRAM, provided that a maximal matching of a given $n$-vertex 
$m$-edge graph can be computed in $T(n,m)$ time using $P(n,m)$ processors on 
the PRIORITY PRAM. According to [IS86], $T(n,m) = \log^3 n$ and $P(n,m) = n + m$. 
So, $M$ can be computed in $O(\log^4 n)$ time using $n + m$ processors on the 
PRIORITY PRAM. □ 

4 Approximation Algorithms for Graphs with Heavy 
Weights 

We next show two algorithms that only use the total order of the weights. The 
first one is an NC algorithm, and the second one is an RNC algorithm. Both 
algorithms contain three phases: 

Outline of Algorithms for heavy weights 

1. For given G, construct a heavy spanning forest F (defined later) of G. 

2. Construct a set of paths P in G[F]. 

3. Produce a matching in G[P]. 

The two algorithms differ only in phase 3. We now describe each phase, 
and analyze its complexity and approximation ratio. 



4.1 The First Phase 

The first phase contains two steps: 

1.1. In parallel, each vertex marks the heaviest edge incident to the vertex; 

1.2. F is the set of marked edges. 

We first show that $G[F]$ is acyclic. 

Proposition 1. $G[F]$ is acyclic. 

Proof. Assume $G[F]$ is not acyclic and $e_1, e_2, \ldots, e_l$ are edges forming a cycle 
in $G[F]$. We let $v_1, v_2, \ldots, v_l$ be the vertices on the cycle. If two consecutive 
vertices mark the same edge, there should be an edge on the cycle not marked by any 
vertex. Hence each vertex marks a different edge. Thus we can assume that $v_i$ 
marks $e_i$, and $w(e_1) < w(e_2)$. However, this implies that 
$w(e_1) < w(e_2) < \cdots < w(e_l) < w(e_1)$, which is a contradiction. □ 
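
The first phase and Proposition 1 can be checked on small inputs with the following sketch. The names `heavy_spanning_forest` and `is_acyclic`, and the representation of edges as two-element frozensets with distinct weights, are assumptions made for this illustration; the loop over vertices is sequential, whereas the paper performs the marking in parallel.

```python
def heavy_spanning_forest(weights):
    """weights: dict frozenset({u, v}) -> distinct positive weight.

    Returns (F, best): F is the set of marked edges and best[v] is the
    heaviest edge incident to v (the edge that v marks in step 1.1).
    """
    best = {}
    for e, w in weights.items():
        for v in e:
            if v not in best or weights[best[v]] < w:
                best[v] = e
    return set(best.values()), best

def is_acyclic(edges):
    """Union-find cycle test on an undirected edge set."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for e in edges:
        u, v = tuple(e)
        ru, rv = find(u), find(v)
        if ru == rv:
            return False                    # e would close a cycle
        parent[ru] = rv
    return True
```

An edge marked by both of its endpoints can be found with `[e for e in F if all(best[v] == e for v in e)]`.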

Thus, $G[F]$ is a set of trees. Moreover, it is trivial that $\deg_{G[F]}(v) > 0$ for 
all $v$ in $V$. Hence we call $F$ a heavy spanning forest of $G$. We now introduce some 
notions for the heavy spanning forest $F$. Let $T$ be a tree in $G[F]$, and $n_T$ be the 
number of vertices in $T$. Then, in the first step, each of the $n_T$ vertices in $T$ marks 
one edge, and $T$ has $n_T - 1$ edges. This implies that $T$ has exactly one edge 
marked by both of its endpoints. We call this edge and its two endpoints the root edge 
and the two roots of $T$, respectively. That is, each tree has one root edge and two roots. 
We can show the following lemma by a simple induction. 

Lemma 4. Let $T$ be a tree in $F$, and $e_r$ be the root edge of $T$. Then for any 
leaf-root path $(e_l, e_1, e_2, \ldots, e_r)$ in $T$, $w(e_l) < w(e_1) < w(e_2) < \cdots < w(e_r)$. 

That is, the root edge is the heaviest edge in the tree. We now show the theorem 
for the relation between w(M*) and w(F). 

Theorem 3. $w(F) \ge w(M^*)$. 

Proof. Let $e = \{u, v\}$ be an edge in $M^*$, but not in $F$. Then, since $e \notin F$, 
$e$ is marked by neither $u$ nor $v$. Let $e_u$ and $e_v$ be the edges marked by $u$ and 
$v$, respectively. Since $M^*$ is a matching, neither $e_u$ nor $e_v$ is in $M^*$. That 
is, $\{e_u, e_v\} \subseteq F - M^*$. Now we divide the weight $w(e)$ into two weights 
$\frac{1}{2} w(e)$, and distribute them to $e_u$ and $e_v$, respectively. Since $e$ is not 
marked, $w(e) < w(e_u), w(e_v)$. Moreover, $e_u$ and $e_v$ receive no weight from the other 
edges in $M^*$ at the vertices $u$ and $v$, since $e$ is an element of the matching $M^*$. 
That is, no edge $e'$ in $F - M^*$ receives more than $w(e')$ from the edges in $M^* - F$. 
Thus each edge in $M^*$ is either in $F$ or can be divided and distributed to two edges 
in $F - M^*$. This implies that $w(F) \ge w(M^*)$. □ 

We now analyze the proof of Theorem 3 in more detail. Let $C = F \cap M^*$, 
$\bar{F} = F - C$, and $\bar{M} = M^* - C$. We let $R$ be the set of the root edges of $F$. Then 
we have the following corollary. 

Corollary 1. $w(F) \ge 2w(M^*) - w(C) - w(R)$. 

Proof. In the proof of Theorem 3, each weight of an edge in $\bar{M}$ is divided and 
distributed to two edges in $\bar{F}$, because the corresponding edges in $\bar{F}$ might 
receive weight at both endpoints. However, only root edges can receive weight at both 
endpoints. Now we distribute each weight of an edge in $\bar{M}$ onto two edges in $\bar{F}$ 
without division. In this case, root edges may receive weight twice, hence we have 
$w(\bar{F}) \ge 2w(\bar{M}) - w(R)$. Thus $w(F) = w(\bar{F}) + w(C) \ge 2w(\bar{M}) - w(R) + w(C) = 
2w(M^* - C) - w(R) + w(C) = 2w(M^*) - w(C) - w(R)$. □ 

4.2 The Second Phase 

2.1. In parallel, each vertex $v$ with $\deg_{G[F]}(v) > 2$ deletes all edges incident to 
$v$ except the two heaviest edges. Let $P$ be the set of remaining edges. (Comment: 
It is easy to see that $G[P]$ is a set of paths, and $\deg_{G[P]}(v) > 0$ still 
holds if $v$ was not a leaf in $F$.) 
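
Step 2.1 can be sketched as follows. The function name `prune_to_paths` and the frozenset edge representation are assumptions made for this illustration, and the loop over vertices is sequential rather than parallel.

```python
from collections import defaultdict

def prune_to_paths(forest, weights):
    """forest: iterable of frozenset({u, v}) edges; weights: edge -> weight.

    Every vertex of degree greater than 2 deletes all incident edges
    except its two heaviest ones; the surviving edge set P is returned.
    """
    incident = defaultdict(list)
    for e in forest:
        for v in e:
            incident[v].append(e)
    deleted = set()
    for v, es in incident.items():
        if len(es) > 2:
            es_sorted = sorted(es, key=lambda e: weights[e], reverse=True)
            deleted.update(es_sorted[2:])    # v keeps its two heaviest edges
    return set(forest) - deleted
```

After pruning, every vertex has degree at most 2 in $G[P]$, so the acyclic graph $G[P]$ consists of vertex-disjoint paths.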

We show a proposition and a lemma for trees, and then the main theorem of this 
subsection. 

Proposition 2. Let $T$ be a tree with $n$ vertices. We let $n_i$ be the number of 
vertices of degree $i$ with $1 \le i \le \Delta_T$. Then (a) $\sum_{i=1}^{\Delta_T} n_i = n$, and 
(b) $\sum_{i=1}^{\Delta_T} i \cdot n_i = 2(n-1)$. 

Proof, (a) is trivial. When each vertex counts up its degree, each edge is counted 
exactly twice. Since any tree with n vertices has n — 1 edges Chapter 4], 

(b) follows. □ 



Lemma 5. Let $L(n, \Delta)$ be the maximum number of leaves in a tree of maximum 
degree $\Delta$ with $n$ vertices. Then $L(n, \Delta) \le \frac{(\Delta - 2)n + 2}{\Delta - 1}$. 

Proof. Let $T$ be an $n$-vertex tree with $L(n, \Delta)$ leaves. Let $V_I$ be the set of 
internal vertices of $T$. Then, the graph induced by $V_I$ is also a tree. The induced 
tree contains $|V_I|$ vertices and $|V_I| - 1$ edges. Each leaf of $T$ is incident to one 
internal vertex. Moreover, each vertex in $V_I$ can be incident to at most $\Delta$ vertices. 
Thus we have $L(n, \Delta) \le \Delta |V_I| - 2(|V_I| - 1)$. We also have $L(n, \Delta) + |V_I| = n$. 
Hence $L(n, \Delta) \le \frac{(\Delta - 2)n + 2}{\Delta - 1}$. □ 



Theorem 4. $\frac{|P|}{|F|} > \frac{1}{\Delta - 1}$ and $\frac{w(P)}{w(F)} \ge \frac{1}{\Delta - 1}$. 

Proof. We first assume that $G[F]$ contains only one tree. We let $n_i$ be the number 
of vertices of degree $i$ in $G[F]$ with $1 \le i \le \Delta$. Each vertex does not delete the 
nearest edge to the root edge. Thus no edge will be deleted by two different 
vertices. Hence, by Proposition 2, the number of edges deleted in step 2.1 is 
equal to 

$$\sum_{i=3}^{\Delta} (i-2) n_i = \sum_{i=3}^{\Delta} i n_i - 2 \sum_{i=3}^{\Delta} n_i = 2(n-1) - n_1 - 2n_2 - 2(n - n_1 - n_2) = n_1 - 2.$$

Using Lemma 5 we have 

$$\frac{|P|}{|F|} = \frac{n - 1 - n_1 + 2}{n - 1} \ge \frac{n + 1 - L(n, \Delta)}{n - 1} \ge \frac{(\Delta-1)(n+1) - (\Delta-2)n - 2}{(n-1)(\Delta-1)} = \frac{n + \Delta - 3}{(n-1)(\Delta-1)} = \frac{1}{\Delta-1} + \frac{\Delta - 2}{(n-1)(\Delta-1)} > \frac{1}{\Delta-1}.$$

We then consider the weights of deleted edges. For each deleted edge $e$, there 
exists at least one edge $e'$ in $P$ with $w(e') > w(e)$. On the other hand, for each 






R. Uehara and Z.-Z. Chen 



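remaining edge $e'$ in $P$, at most $\Delta - 2$ deleted edges $e$ need to correspond to it: 
a vertex deletes at most $\Delta - 2$ edges and can assign all of them to its second 
heaviest edge, which remains in $P$. Hence $w(F) = w(P) + w(F - P) \le (\Delta - 1) w(P)$, 
that is, $\frac{w(P)}{w(F)} \ge \frac{1}{\Delta - 1}$. 

When $G[F]$ contains two or more trees, the discussion above can be applied 
to each tree. More precisely, let $F$ contain $k$ trees $T_1, T_2, \ldots, T_k$, and $P_i$ be 
the path set obtained from $T_i$ with $1 \le i \le k$. Using the discussion above, we 
have $\frac{w(P_i)}{w(T_i)} \ge \frac{1}{\Delta - 1}$, consequently, $w(P_i) \ge \frac{1}{\Delta - 1} w(T_i)$ with 
$1 \le i \le k$. Thus $w(P) = \sum_{i=1}^{k} w(P_i) \ge \frac{1}{\Delta - 1} \sum_{i=1}^{k} w(T_i) = \frac{1}{\Delta - 1} w(F)$, 
consequently, $\frac{w(P)}{w(F)} \ge \frac{1}{\Delta - 1}$. □ 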



4.3 The Third Phase 
NC Algorithm 

We define the distance of edges to describe the third phase of the NC algorithm. Let 
$\alpha = (e_1, e_2, \ldots, e_l)$ be a path of length $l$. Then the distance of $e_i$ from $e_1$ 
on $\alpha$, denoted by $D(e_i, e_1)$, is defined by $D(e_1, e_1) = 0$, and 
$D(e_i, e_1) = D(e_{i-1}, e_1) + 1$ for $1 < i \le l$. The third phase of the NC algorithm 
contains the following steps: 

3.1. In parallel, find the heaviest edge in each path in $P$. 

3.2. $M_N$ is the set of edges having even distance from the heaviest edge on 
the same path. 
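
On a single path, steps 3.1 and 3.2 amount to the following selection rule. This is a sequential sketch: the function name `even_distance_matching` and the representation of a path as a list of edge weights in path order are assumptions made for the example.

```python
def even_distance_matching(path_weights):
    """path_weights[j] is the weight of the j-th edge along one path of P.

    Returns the indices of the edges at even distance from the heaviest
    edge, i.e. the alternating path that contains the heaviest edge.
    """
    # Step 3.1: locate the heaviest edge of the path.
    h = max(range(len(path_weights)), key=lambda j: path_weights[j])
    # Step 3.2: keep every edge whose distance from it is even.
    return [j for j in range(len(path_weights)) if (j - h) % 2 == 0]
```

Selected edges are pairwise at distance at least 2 along the path, so they share no endpoint and form a matching.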

As a result, each path in $P$ becomes an alternating path, admitted by $M_N$, that 
contains the heaviest edge on the path. We here show a proposition and a useful 
lemma for paths with special properties. 

Proposition 3. Let $\alpha = (e_1, e_2, \ldots, e_l)$ be a path with 
$w(e_1) > w(e_2) > \cdots > w(e_l)$. Then the maximum weighted matching of $\alpha$, say 
$M_\alpha$, is either $\{e_1, e_3, \ldots, e_l\}$ for odd $l$, or $\{e_1, e_3, \ldots, e_{l-1}\}$ for 
even $l$. Moreover, $w(M_\alpha) \ge \frac{1}{2} w(\alpha)$. 

Proof. When $l$ is even, considering $w(e_1) > w(e_2)$, $w(e_3) > w(e_4)$, ..., 
$w(e_{l-1}) > w(e_l)$, we immediately have the proposition. When $l$ is odd, the last 
edge $e_l$ just increases the weight of $M_\alpha$. □ 



Lemma 6. Let $\alpha = (e_1, e_2, \ldots, e_l)$ be a path such that 
$w(e_1) < w(e_2) < \cdots < w(e_{i-1}) < w(e_i) > w(e_{i+1}) > \cdots > w(e_l)$ for some $i$ 
with $1 < i < l$. Let $A_1$ be the alternating path containing $e_i$, $A_2$ be the other 
alternating path, and $M_\alpha$ be the maximum weighted matching of $\alpha$. 

(1) Either $A_1 = M_\alpha$ or $A_2 = M_\alpha$. (Hence $w(M_\alpha) \ge \frac{1}{2} w(\alpha)$.) 

(2) When $A_2 = M_\alpha$, $w(e_{i-1}) + w(e_{i+1}) > w(e_i)$. 

(3) $w(A_1) \ge \frac{1}{3} w(\alpha)$. 

Proof. (1) We first show that $M_\alpha$ satisfies either (a) $e_i \in M_\alpha$ or 
(b) $\{e_{i-1}, e_{i+1}\} \subseteq M_\alpha$. To derive a contradiction, assume that 
$e_i \notin M_\alpha$, $e_{i-1} \notin M_\alpha$, and $e_{i+1} \in M_\alpha$. Then 
$(M_\alpha - \{e_{i+1}\}) \cup \{e_i\}$ is a matching heavier than $M_\alpha$ since 
$w(e_i) > w(e_{i+1})$, which is a contradiction. The other symmetric case 
($e_i \notin M_\alpha$, $e_{i-1} \in M_\alpha$, and $e_{i+1} \notin M_\alpha$) can be shown by the 
same argument. In the case (a), we have $e_i \in M_\alpha$, $e_{i-1} \notin M_\alpha$, and 
$e_{i+1} \notin M_\alpha$. Then we can consider $\alpha$ as two paths $(e_1, e_2, \ldots, e_{i-2})$ 
and $(e_{i+2}, \ldots, e_l)$ separately. Using Proposition 3 we have 
$M_\alpha = \{e_i\} \cup \{e_{i-2}, e_{i-4}, \ldots\} \cup \{e_{i+2}, e_{i+4}, \ldots\}$, consequently, 
$M_\alpha = A_1$. In the case (b), we have $e_i \notin M_\alpha$, $e_{i-1} \in M_\alpha$, and 
$e_{i+1} \in M_\alpha$. Thus we have 
$M_\alpha = \{e_{i-1}, e_{i+1}\} \cup \{e_{i-3}, e_{i-5}, \ldots\} \cup \{e_{i+3}, e_{i+5}, \ldots\}$, 
consequently, $M_\alpha = A_2$. 

(2) Let $A'_2$ be $(A_2 - \{e_{i-1}, e_{i+1}\}) \cup \{e_i\}$. Then $A'_2$ is a matching. Since 
$A_2$ is the maximum weighted matching, $w(A'_2) < w(A_2)$. This implies that 
$w(e_{i-1}) + w(e_{i+1}) > w(e_i)$. 

(3) By (1), either $A_1 = M_\alpha$ or $A_2 = M_\alpha$. When $A_1 = M_\alpha$, the claim follows 
from Proposition 3. Thus we assume that $A_2 = M_\alpha$. 

We here consider two paths $\alpha_1 = (e_1, e_2, \ldots, e_{i-1}, e_i)$ and 
$\alpha_2 = (e_i, e_{i+1}, \ldots, e_l)$. Let $A_1^1$ (and $A_1^2$) be the alternating path of 
$\alpha_1$ (and $\alpha_2$, resp.) containing $e_i$. That is, $A_1^1$ is the former half of $A_1$, 
$A_1^2$ is the latter half of $A_1$, and $A_1^1 \cap A_1^2 = \{e_i\}$. Then, by Proposition 3, 
$w(A_1^1) \ge \frac{1}{2} w(\alpha_1)$ and $w(A_1^2) \ge \frac{1}{2} w(\alpha_2)$. 

Thus, $w(A_1) = w(A_1^1 \cup A_1^2) = w(A_1^1) + w(A_1^2) - w(A_1^1 \cap A_1^2) \ge 
\frac{1}{2}(w(\alpha_1) + w(\alpha_2)) - w(e_i) = \frac{1}{2}(w(\alpha) + w(e_i)) - w(e_i) = 
\frac{1}{2}(w(\alpha) - w(e_i)) \ge \frac{1}{2}(w(\alpha) - w(A_1))$. 
This implies that $w(A_1) \ge \frac{1}{3} w(\alpha)$. □ 

We here remark that, in Lemma 6(1), we cannot determine which alternating 
path is heavier in general. (For example, a path $(e_1, e_2, e_3)$ has different answers 
when $w(e_1) = 1, w(e_2) = 3, w(e_3) = 1$ and when $w(e_1) = 2, w(e_2) = 3, w(e_3) = 2$.) 
We also remark that Lemma 6(1) does not hold for general weighted paths (for 
example, neither alternating path of $(e_1, e_2, e_3, e_4)$ is the maximum weighted 
matching for $w(e_1) = 5, w(e_2) = 1, w(e_3) = 1, w(e_4) = 5$). 
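
The two remarks can be confirmed by exhaustive search on the small paths they mention. The helper names `max_matching_on_path` and `alternating_paths` are hypothetical; edges are identified by their 0-based position on the path, so the two alternating paths are the even and the odd positions.

```python
from itertools import combinations

def max_matching_on_path(w):
    """Brute-force maximum weight matching of a path with edge weights w."""
    best, best_set = 0, ()
    for r in range(len(w) + 1):
        for s in combinations(range(len(w)), r):
            # A set of path edges is a matching iff no two are adjacent.
            if all(b - a > 1 for a, b in zip(s, s[1:])):
                total = sum(w[j] for j in s)
                if total > best:
                    best, best_set = total, s
    return best, set(best_set)

def alternating_paths(w):
    """The two alternating edge sets of a path: even and odd positions."""
    return set(range(0, len(w), 2)), set(range(1, len(w), 2))
```

For the weights (5, 1, 1, 5) the search returns the two end edges, which is neither of the two alternating sets, in line with the second remark.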

We now show the relation between $w(M_N)$ and $w(P)$. 

Lemma 7. $w(M_N) \ge \frac{1}{3} w(P)$. 

Proof. We first observe the following claim: in step 2.1, if vertex $v$ deletes an edge 
$\{u, v\}$, the edge was marked by $u$ in step 1.1. This is easy because each vertex 
marked its heaviest edge in step 1.1, and keeps its heaviest edge(s) in step 
2.1. Using the claim and a simple induction, we can show that each path in $P$ is 
either 

(1) a part of some leaf-root path in some tree in $F$; or 

(2) two leaf-root paths connected by the root edge in a tree in $F$. 

In the case (1), combining Lemma 4 and Proposition 3, $M_N$ contains the 
maximum weighted matching of the path, and that has at least half the weight of 
the path. Thus it is sufficient to show the claim for the case (2). By Lemma 4, the 
path $\alpha = (e_1, e_2, \ldots, e_l)$ satisfies 
$w(e_1) < w(e_2) < \cdots < w(e_{r-1}) < w(e_r) > w(e_{r+1}) > \cdots > w(e_l)$, where $e_r$ is 
the root edge. Thus, according to Lemma 6(3), for the alternating path $A_r$ 
containing $e_r$, $w(A_r) \ge \frac{1}{3} w(\alpha)$. Thus $w(M_N) \ge \frac{1}{3} w(P)$. □ 

Combining Theorem 3, Theorem 4 and Lemma 7, we can show that the NC 
algorithm is a $\frac{1}{3(\Delta - 1)}$-approximation algorithm. But the better approximation 
ratio $\frac{2}{3\Delta + 2}$ will be stated in Section 4.5. 

RNC Algorithm 

Phases 1 and 2 are performed "locally". That is, all computations can be 
performed within the neighbors. Using randomization, the RNC algorithm finds a 
matching in $G[P]$ locally. The third phase of the RNC algorithm contains the following 
steps: 

3.1'. In parallel, each vertex randomly chooses one of the two edges incident to the 
vertex in $G[P]$. (A vertex of degree one chooses the unique edge incident 
to it.) 

3.2'. $M_R$ is the set of edges chosen by both endpoints. 

Since each vertex chooses one edge, the resulting $M_R$ is a matching. Moreover, 
since each edge in $P$ is chosen with probability at least $\frac{1}{4}$, we immediately have 
the following lemma. 

Lemma 8. The expected value of $w(M_R)$ is at least $\frac{1}{4} w(P)$. 
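
Steps 3.1' and 3.2' can be simulated as follows. This is a sequential sketch: the name `random_local_matching`, the edge-list input, and the optional `rng` argument (any object with a `choice` method, such as a `random.Random` instance) are assumptions made for the example.

```python
import random

def random_local_matching(path_edges, rng=random):
    """path_edges: list of (u, v) pairs forming vertex-disjoint paths.

    Each vertex picks one of its incident edges uniformly at random
    (step 3.1'); an edge is kept when both endpoints picked it (step 3.2').
    """
    incident = {}
    for e in path_edges:
        for v in e:
            incident.setdefault(v, []).append(e)
    choice = {v: rng.choice(es) for v, es in incident.items()}
    return [e for e in path_edges if choice[e[0]] == e and choice[e[1]] == e]
```

An edge whose endpoints both have degree 2 survives with probability exactly 1/4, and an edge with a degree-one endpoint survives with probability at least 1/2, which matches the bound used in Lemma 8.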

4.4 Complexity of Algorithms 

Each algorithm uses n processors; every vertex in G has a processor associated 
with it. As the input representation of G, we assume that each vertex has a list 
of the edges incident to it. Thus, each edge {i,j} has two copies - one in the 
edge list for vertex i and the other in the edge list for vertex j. 

Theorem 5. The NC algorithm runs in O(logn) time using n processors on the 
PRIORITY PRAM. The algorithm only requires the total order of the weights. 

Proof. Each processor uses two memory cells to store the edges in $P$. The first 
and second phases can be implemented efficiently with the following modification: 

1.1'. In parallel, each vertex $v$ finds the heaviest edge $e = \{u, v\}$ incident to 
$v$; 

1.2'. In parallel, $v$ stores $e$ in the first cell of $v$. 

2.1'. In parallel, each vertex $v$ checks the contents of the first cell of $u$. If it 
is $e$, then the process ends. If it is not $e$, $v$ tries to store $e$ in the second cell 
of $u$. This attempt succeeds if $w(e)$ is the heaviest among the 
edges that vertices try to store in the same cell. 

Step 1.2' can be done in unit time. Moreover, it is not difficult to see that 
steps 1.1' and 2.1' can be done in $O(\log \Delta)$ time using a standard technique 
with comparison operations. 

In the third phase, it is easy to see the following: 

(1) if $e = \{u, v\}$ is a root edge, $e$ is stored in the first cells of both $u$ and $v$; and 

(2) otherwise, $e$ is stored in the first cell of one endpoint, and in the second cell 
of the other. 

Moreover, each non-root edge knows which endpoint is close to the root edge; 
the endpoint storing the second cell with the edge. Thus step 3.1 can be done in 
0(1) time, and step 3.2 can be done in O(logn) time using standard list ranking 
technique (see e.g., IKK9UI 1. 

Throughout the computation, the algorithm only compares two weights of 
edges. Thus the algorithm only requires to know the total order of the weights. 

□ 

The third phase of the RNC algorithm can be performed in $O(1)$ time. This 
immediately implies the following theorem. 

Theorem 6. The RNC approximation algorithm runs in $O(\log \Delta)$ time using 
$n$ processors on the PRIORITY PRAM. The algorithm only requires the total 
order of the weights. 

4.5 Approximation Ratios 

Recall that $M^*$ is a maximum weighted matching, $F$ is the heavy spanning 
forest, $R$ is the set of the root edges of $F$, and $P$ is the set of paths obtained in 
step 2.1. Moreover we let $C = F \cap M^*$, $\bar{F} = F - C$, and $\bar{M} = M^* - C$. 

To derive good approximation ratios, we define two maximum matchings: $M_P$ 
denotes a maximum weighted matching of $G[P]$, and $M_F$ denotes a maximum 
weighted matching of $G[F]$. 

Lemma 9. $w(M_P) \ge \frac{1}{2(\Delta - 1)} w(F)$. 

Proof. As seen in the proof of Lemma 7, each path in $P$ is either 

(1) a part of some leaf-root path in some tree in $F$; or 

(2) two leaf-root paths connected by a root edge in a tree in $F$. 

For each path, by Lemma 6(1), $M_P$ contains the heavier alternating path, which has 
at least half the weight of the path. This together with Theorem 4 implies that 
$w(M_P) \ge \frac{1}{2} w(P) \ge \frac{1}{2(\Delta - 1)} w(F)$. □ 

Lemma 10. $w(M_F) \ge \frac{1}{\Delta} w(M^*)$. 

Proof. Recall that $M_F$ is the maximum weighted matching in $F$. Thus, 
since $C$ is a matching in $F$, $w(M_F) \ge w(C)$. It is easy to see that $R$ is a matching 
in $F$. This implies that $w(M_F) \ge w(R)$. It is also easy to see that $M_P$ is a 
matching in $F$, and thus $w(M_F) \ge w(M_P)$. Hence, using Corollary 1, we 
have $w(F) \ge 2w(M^*) - w(C) - w(R) \ge 2w(M^*) - 2w(M_F)$. On the other hand, 
by Lemma 9, $w(M_F) \ge w(M_P) \ge \frac{1}{2(\Delta - 1)} w(F)$. Combining these inequalities, we 
have $(2(\Delta - 1) + 2) w(M_F) \ge 2w(M^*)$, consequently, $w(M_F) \ge \frac{1}{\Delta} w(M^*)$. □ 

Lemma 11. $w(C) \le w(M_F) \le 2w(M_P)$. 

Proof. It is clear that $w(C) \le w(M_F)$. Thus we show $w(M_F) \le 2w(M_P)$. We 
are going to show that the weight of each edge in $M_F$ can be distributed to an 
edge in $M_P$, and each edge in $M_P$ receives weight from such edges at 
most twice. Let $e = \{u, v\}$ be any edge in $M_F$. Then three cases occur according 
to $e$. 

(1) $e \in M_P$. We distribute $w(e)$ to $e$ itself. 

(2) $e \in P - M_P$. We first assume that $e$ is not a root edge. We assume that $u$ 
is closer to the root edge than $v$ on $G[F]$. In this case, $e$ is incident to $e'$ in $P$ at 
the vertex $u$ with $w(e') > w(e)$. Thus we distribute $w(e)$ to $e'$. We next assume 
that $e$ is a root edge. That is, $e$ is a root edge not in the maximum weighted 
matching of $G[P]$. Then, by Lemma 6, $M_P$ contains two edges $e'$ and $e''$ such 
that $e'$ and $e''$ are the edges incident to $e$ at the vertices $u$ and $v$, respectively, and 
$w(e') + w(e'') > w(e)$. Thus we divide $w(e)$ into $w(e')$ and $w(e) - w(e')$ $(< w(e''))$, 
and distribute them to $e'$ and $e''$, respectively. 

(3) $e \notin P$. We assume that $u$ is closer to the root edge than $v$ on $G[F]$. In 
this case, $e$ was deleted by $u$ in step 2.1. The vertex $u$ keeps two edges $e'$ and 
$e''$ in $P$ with $w(e'), w(e'') > w(e)$. Moreover, either $e'$ or $e''$ is in $M_P$. Thus we 
distribute $w(e)$ to that edge in $M_P$. 

Since $M_F$ is a matching, no two of its edges distribute weight at the same endpoint. 
Thus each edge in $M_P$ receives weight at most twice, once at each endpoint. This 
implies that $w(M_F) \le 2w(M_P)$. □ 

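Theorem 7. $w(M_P) \ge \frac{2}{2\Delta + 1} w(M^*)$. 

Proof. Combining Corollary 1 and Lemma 9, we get 
$w(M_P) \ge \frac{1}{2(\Delta - 1)} (2w(M^*) - w(R) - w(C))$. Using Lemma 11, we have 
$2w(M_P) \ge w(C)$. On the other hand, since $R \subseteq M_N$, $w(M_P) \ge w(M_N) \ge w(R)$. 
Thus, $w(M_P) \ge \frac{1}{2(\Delta - 1)} (2w(M^*) - w(R) - w(C)) \ge 
\frac{1}{2(\Delta - 1)} (2w(M^*) - w(M_P) - 2w(M_P))$. Thus 
$(2(\Delta - 1) + 3) w(M_P) \ge 2w(M^*)$, consequently, $w(M_P) \ge \frac{2}{2\Delta + 1} w(M^*)$. □ 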

Theorem 8. The approximation ratio of the NC algorithm is $\frac{2}{3\Delta + 2}$. 

Proof. We first show that $w(M_N) \ge \frac{1}{2} w(M_P)$. As seen in the proof of Lemma 
7, each path in $P$ is either 

(1) a part of some leaf-root path in some tree in $F$; or 

(2) two leaf-root paths connected by a root edge in a tree in $F$. 

In the case (1), both $M_N$ and $M_P$ contain the same alternating path, the one that 
contains the heaviest edge. We consider the paths in the case (2). Let $\alpha$ be the path 
in $P$, $A_1$ be the alternating path of $\alpha$ containing the root edge, and $A_2$ be 
the other alternating path. According to Lemma 6, $A_1$ or $A_2$ is the maximum 
weighted matching of $\alpha$. When $A_1$ is the maximum weighted matching, both $M_N$ 
and $M_P$ contain it. Now we assume that $A_2$ is the maximum weighted matching 
of $\alpha$. Then, by Lemma 6(3), $w(A_1) \ge \frac{1}{3} w(\alpha)$, consequently, 
$w(A_2) \le \frac{2}{3} w(\alpha)$. Thus $w(A_1) \ge \frac{1}{2} w(A_2)$. Therefore, in all cases, 
$w(M_N) \ge \frac{1}{2} w(M_P)$. 

Combining Corollary 1, Theorem 4 and Lemma 7, we have 
$w(M_N) \ge \frac{1}{3} w(P) \ge \frac{1}{3(\Delta - 1)} w(F) \ge \frac{1}{3(\Delta - 1)} (2w(M^*) - w(C) - w(R))$. 
It is clear that $w(M_N) \ge w(R)$ since $R \subseteq M_N$. Thus, using Lemma 11, we have 
$w(M_N) \ge \frac{1}{3(\Delta - 1)} (2w(M^*) - w(C) - w(R)) \ge 
\frac{1}{3(\Delta - 1)} (2w(M^*) - 2w(M_P) - w(M_N)) \ge \frac{1}{3(\Delta - 1)} (2w(M^*) - 5w(M_N))$, 
consequently, $w(M_N) \ge \frac{2}{3\Delta + 2} w(M^*)$. □ 



Theorem 9. The approximation ratio of the RNC algorithm is $\frac{1}{2\Delta + 4}$. 

Proof. Using Corollary 1, Theorem 4 and Lemma 8, we have 
$E(w(M_R)) \ge \frac{1}{4} w(P) \ge \frac{1}{4(\Delta - 1)} w(F) \ge \frac{1}{4(\Delta - 1)} (2w(M^*) - w(C) - w(R))$. 

We now compare $w(M_R)$ with $w(M_P)$. Each edge in $M_P$ appears in $M_R$ 
with probability at least $\frac{1}{4}$. This implies that the expected value of $w(M_R)$ is at 
least $\frac{1}{4} w(M_P)$. Thus, using Lemma 11 and $w(R) \le w(M_P)$, we have 
$E(w(M_R)) \ge \frac{1}{4(\Delta - 1)} (2w(M^*) - w(C) - w(R)) \ge 
\frac{1}{4(\Delta - 1)} (2w(M^*) - 3w(M_P)) \ge \frac{1}{4(\Delta - 1)} (2w(M^*) - 12 E(w(M_R)))$, 
consequently, $E(w(M_R)) \ge \frac{1}{2\Delta + 4} w(M^*)$. □ 

It Is on the Boundary: 

Complexity Considerations for Polynomial Ideals 

Abstract. Systems of (in general non-linear but) polynomial equations 
over some ring or field R occur in numerous situations when dealing with 
problems in modelling, simulation, geometric representation and analysis, 
deduction, symbolic algebra, or dynamical systems, to name just a 
few. Algebraic varieties are determined by such systems (say over the 
reals or the complex number field), but they also have uses for 
propositional proof systems or for the modelling of certain parallel processes (as 
with reversible Petri nets). 

Many, if not most, of the questions concerning such systems of polynomial 
equations reduce to the investigation of properties of polynomial ideals 
in multivariate polynomial rings (like $\mathbb{Q}[x_1, \ldots, x_n]$ or $\mathbb{Z}_2[x_1, \ldots, x_n]$), 
and earlier results have shown that such questions are generally very 
hard in the algorithmic sense, namely complete for the complexity class 
EXPSPACE. 

As it turns out, the algorithmic complexity of questions about polynomial 
ideals is really determined (as one might expect, of course) by the 
properties of the sets of exponent vectors occurring in the non-zero terms of 
the polynomials, and thus by the properties of seemingly well-structured 
subsets of $\mathbb{N}^n$, the nonnegative orthant of $\mathbb{Z}^n$. The algorithmic properties 
of such subsets have been studied extensively in numerous areas, as 
mentioned above, and their complexity has consistently been misjudged, 
based on the fact that "far out", i.e., for sufficiently large values of the 
coordinates, most of the relevant algorithmic problems become easy (NP 
or even P). 

Here, we discuss how properties at the boundary of $\mathbb{N}^n$ (to $\mathbb{Z}^n$) affect 
the algorithmic complexity of basic questions about polynomial ideals. 
We also present some new algorithmic and complexity theoretic results 
based on ideal dimension and other structural properties, like for toric 
ideals. 

Abstract. We present an efficient parallel algorithm for scheduling $n$ 
unit length tasks on $m$ identical processors when the precedence graphs 
are interval orders. Our algorithm requires $O(\log^2 v + (n \log n)/v)$ time 
and $O(nv^2 + n^2)$ operations on the CREW PRAM, where $v \le n$ is a 
parameter. By choosing $v = \sqrt{n}$, we obtain an $O(\sqrt{n} \log n)$-time 
algorithm with $O(n^2)$ operations. For $v = n/\log n$, we have an $O(\log^2 n)$-time 
algorithm with $O(n^3/\log^2 n)$ operations. The previous solution 
takes $O(\log^2 n)$ time with $O(n^3 \log^2 n)$ operations on the CREW PRAM. 
Our improvement is mainly due to a reduction of the $m$-processor 
scheduling problem for interval orders to that of finding a maximum matching 
in a convex bipartite graph. 



1 Introduction 



The m-processor scheduling problem for a precedence graph G is defined as 
follows. An input graph G has n vertices each of which represents a task to be 
executed on any one of m identical processors. Each task requires exactly one 
unit of execution time on any processor. At any timestep at most one task can 
be executed by a processor. If there is a directed edge from task t to task t', then 
task t must be completed before task t' is started. An m-processor schedule for 
G specifies the timestep and the processor on which each task is to be executed. 
The length of a schedule is the number of timesteps in it. A solution to the 
problem is an optimal (i.e., shortest length) schedule for G. 

The m-processor scheduling problem for arbitrary precedence graphs has 
been studied extensively. When m = 2, there are polynomial-time algorithms for 
the problem [6I3I9I/] . and when m is part of the input, the problem is known to be 
NP-hard j2(Jj . When m is part of the input, several researchers have considered 
restrictions on the precedence graphs. Polynomial-time algorithms for the m- 
processor scheduling problem are known for the cases that the precedence graphs 

* This work was supported by the Brain Korea 21 Project. 
are trees m and interval orders [E|. A survey of results on other special cases 
of the problem can be found in m 

In parallel computation, the two processor case has been studied mostly. 
When m = 2, Helmbold and Mayr El gave the first NC algorithm and Vazi- 
rani and Vazirani ED presented an RNC algorithm. Jung, Serna and Spirakis 
m developed an 0(log^ n)-time algorithm using 0{n^ log^ n) operations on the 
CREW PRAM. When m = 2 and the precedence graphs are interval orders, 
Moitra and Johnson m and Chung, Park and Cho [3 gave NC algorithms, and 
the one in |2] requires 0(log^ v + (nlogn)/u) time and 0{nv‘^ + n?) operations 
on the CREW PRAM, where u < n is a parameter. 

When m is part of the input and the precedence graphs are interval or- 
ders, Sunder and He El developed the first NC algorithm for the scheduling 
problem, which takes 0(log^ n) time using 0(n®log^n) operations or O(log^n) 
time using O(n^log^n) operations on the priority CRCW PRAM. Mayr jTl] 
gave an 0(log^ n)-time algorithm using O(n^log^n) operations on the CREW 
PRAM. 

In this paper, we present an efficient parallel algorithm for the $m$-processor 
scheduling problem when the precedence graphs are interval orders. Our 
algorithm takes $O(\log^2 v + (n \log n)/v)$ time using $O(nv^2 + n^2)$ operations on the 
CREW PRAM, where $v \le n$ is a parameter. By choosing $v = \sqrt{n}$, we obtain an 
$O(\sqrt{n} \log n)$-time algorithm with $O(n^2)$ operations. For $v = n/\log n$, we have an 
$O(\log^2 n)$-time algorithm with $O(n^3/\log^2 n)$ operations. 

We briefly compare Mayr's algorithm and ours. A parallel algorithm that 
computes the length of an optimal $m$-processor schedule for an interval order 
will be called an $m$-LOS algorithm. Mayr's algorithm basically consists of two 
parts. The first part uses an $m$-LOS algorithm to compute the lengths of 
optimal schedules, which takes $O(\log^2 n)$ time using $O(n^3 \log^2 n)$ operations on the 
CREW PRAM. The second part computes an actual schedule, which takes 
$O(\log^2 n)$ time using $O(n^3 \log^2 n)$ operations on the CREW PRAM. Our 
algorithm also consists of two parts and its first part is an $m$-LOS algorithm, but 
it differs from Mayr's in the following respects. 

— We give an efficient $m$-LOS algorithm that takes $O(\log^2 v + (n \log n)/v)$ 
time and $O(nv^2 + n^2)$ operations on the CREW PRAM by generalizing the 
techniques used for two-processor scheduling in EJ. 

- After computing the lengths of optimal schedules, we reduce the m-processor
scheduling problem for interval orders to that of finding a maximum matching
in a convex bipartite graph, using the lengths to compute an actual
scheduling. Therefore, the part of computing an actual scheduling in our
algorithm takes $O(\log^2 n)$ time using $O(n \log^2 n)$ operations on the EREW
PRAM.

The remainder of this paper is organized as follows. The next section gives 
basic definitions and a sequential scheduling algorithm. Section 3 describes the 
reduction of m-processor scheduling to maximum matching in a convex bipartite 
graph. Section 4 describes our efficient m-LOS algorithm. 

2 Basic Definitions and Sequential Algorithm

In this section we describe basic definitions and a sequential m-processor sche- 
duling algorithm. An instance of the m-processor scheduling problem is given by 
a precedence graph G = (V,E). A precedence graph is an acyclic and transitively 
closed digraph. Each vertex of G represents a task whose execution requires unit 
time on one of m identical processors. If there is a directed edge from task t to 
task t', then task t must be completed before task t' is started. In such a case, 
we call t a predecessor of t' and t' a successor of t. We use (t, t') to denote a
directed edge from t to t'. A schedule is a mapping from tasks to timesteps such
that at most m tasks are mapped to each timestep and for every edge (t, t'), t is 
mapped to an earlier timestep than t' . The length of a schedule is the number 
of timesteps used. An optimal schedule is one with the shortest length. 

Let $I = \{I_1, \ldots, I_n\}$ be a set of intervals with each interval $I_i$ represented by
$I_i.l$ and $I_i.r$, where $I_i.l$ and $I_i.r$ denote the left and right endpoints of interval
$I_i$, respectively. Without loss of generality, we assume that all the endpoints are
distinct. We also assume that the intervals are labeled in the increasing order
of right endpoints, i.e., $I_1.r < I_2.r < \cdots < I_n.r$, because sorting can be done in
$O(\log n)$ time using $O(n \log n)$ operations on the EREW PRAM. Given a set
$I$ of $n$ intervals, let $G_I = (V, E)$ be a graph such that

- $V = I = \{I_1, I_2, \ldots, I_n\}$ and

- $E = \{(I_i, I_j) \mid 1 \le i, j \le n \text{ and } I_i.r < I_j.l\}$.



Such a graph $G_I$ is called an interval order. Note that $G_I$ is a precedence
graph. Given a set $I$ of $n$ intervals, the interval graph $\bar{G}_I$ is an undirected graph
such that each vertex corresponds to an interval in $I$ and two vertices are adjacent
whenever the corresponding intervals have at least one point in common.
Therefore, an interval graph $\bar{G}_I$ is a complement of the interval order $G_I$. We say
that two vertices are independent if they are not adjacent in a graph. Note that
overlapping intervals are adjacent in $\bar{G}_I$ and they are independent of each other
in $G_I$. In what follows, we use the words tasks and intervals interchangeably.

A schedule of length $r$ on $m$ processors for an interval order $G_I$ can be
represented by an $m \times r$ matrix $M$, where the columns are indexed by $1, \ldots, r$
and the rows are indexed by $1, \ldots, m$. Let $P_1, \ldots, P_m$ denote the $m$ identical
processors. If task $x$ is scheduled on processor $P_i$ at timestep $\tau$, then $x$ is assigned
to a slot $M[i, \tau]$. No two tasks are assigned to the same slot in $M$. A slot of $M$
to which no task is assigned is said to have an empty task. We assume that the
right endpoint of an empty task is larger than all right endpoints in $I$. A column
of $M$ is called full if it does not have an empty task. Let $opt(I)$ be the length of
an optimal schedule for an interval order $G_I$.

Algorithm m-seq($I$, $m$)

Input: intervals in $I$
Output: $m \times opt(I)$ matrix $M_S$



An Efficient Parallel Algorithm for Schednling Interval Ordered Tasks 






begin
    $\tau \leftarrow 1$;
    $S_\tau \leftarrow$ the list of intervals in $I$ sorted in the increasing order of right endpoints;
    while $S_\tau \ne \emptyset$ do
        $S' \leftarrow \{\}$;
        Extract the first interval from $S_\tau$ and insert it into $S'$;
        repeat
            Scan $S_\tau$ from left to right. When interval $w$ is scanned,
            if $w$ overlaps every interval in $S'$
            then extract $w$ from $S_\tau$ and insert it into $S'$ fi;
        until ($S'$ contains $m$ intervals or all intervals of $S_\tau$ are considered)
        Schedule the intervals of $S'$ in column $\tau$ of $M_S$
            in the order of the elements in list $S'$;
        $S_{\tau+1} \leftarrow S_\tau$;
        $\tau \leftarrow \tau + 1$;
    od
    Output the schedule $M_S$ constructed;
end



Fig. 1. Sequential scheduling algorithm 
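
The greedy column-filling loop of Fig. 1 can be sketched in Python as follows. This is only an illustrative sketch: the function names `overlaps` and `m_seq` and the representation of an interval as a `(left, right)` tuple are assumptions made here, and the simple list scanning below does not attempt to reach the $O(n \log n)$ running time of the original algorithm.

```python
def overlaps(a, b):
    """Two intervals (left, right) with distinct endpoints share a point."""
    return a[0] < b[1] and b[0] < a[1]


def m_seq(intervals, m):
    """Greedy schedule: returns a list of columns, each a list of intervals.

    Hypothetical helper mirroring Fig. 1; column t holds the tasks run at
    timestep t + 1 on at most m processors.
    """
    # S_tau: intervals sorted in increasing order of right endpoints.
    remaining = sorted(intervals, key=lambda iv: iv[1])
    schedule = []
    while remaining:
        column = [remaining[0]]          # first ending interval opens the column
        rest = []
        for w in remaining[1:]:          # one left-to-right scan of S_tau
            if len(column) < m and all(overlaps(w, u) for u in column):
                column.append(w)         # w overlaps every interval chosen so far
            else:
                rest.append(w)           # w stays for a later column
        schedule.append(column)
        remaining = rest
    return schedule


example = [(0, 3), (1, 4), (2, 5), (6, 8), (7, 9)]
print(m_seq(example, 2))
```

In the example, the first three intervals overlap pairwise and both of the last two start after all of them end, so with two processors three timesteps are needed.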



The sequential algorithm [ini in Figure n solves the m-processor scheduling 
problem for an interval order $G_I$, and it runs in $O(n \log n)$ time. Let $I(1, j)$
denote $\{I_1, \ldots, I_j\}$, $1 \le j \le n$. Note that m-seq computes an optimal schedule
for $G_{I(1,j)}$. We can easily get the following facts from algorithm m-seq.

Fact 1 All the intervals in the same column of $M_S$ overlap each other.

Fact 2 In each column $\tau$ of $M_S$ in m-seq, $M_S[1, \tau].r < M_S[2, \tau].r < \cdots < M_S[m, \tau].r$.

Fact 3 In the first row of $M_S$, $M_S[1, 1].r < M_S[1, 2].r < \cdots < M_S[1, opt(I)].r$.

Proof. It follows from the fact that for every $\tau$, $M_S[1, \tau]$ is the first ending interval
in $S_\tau$ and $M_S[1, \tau']$ with $\tau' > \tau$ is in $S_\tau$.

3 Constructing an Optimal Schedule 

In this section we describe our parallel m-processor scheduling algorithm for 
interval orders. We first describe characteristics of maximal cliques in interval 
graphs. A set of intervals form a clique if each pair of intervals in the set has a 
nonempty intersection. If we scan any given interval x from its left endpoint to 
its right, we can meet all those maximal cliques to which x belongs. This yields 
the Gilmore-Hoffman theorem m- 

Fig. 2. An interval set $I$ and $G_I$'s optimal schedule when $m = 3$.



Theorem 1. pm The maximal cliques of an interval graph can be linearly or- 
dered so that for any given interval x, the set of cliques in which x occurs appear 
consecutively in the linear order. 

Let $k$ be the number of maximal cliques in $\bar{G}_I$. Let $C_1, \ldots, C_k$ be the maximal
cliques of $\bar{G}_I$ in the ordering of Theorem 1. Given an interval set, we can find
the maximal cliques of the interval graph $\bar{G}_I$ using Lemma 1. In Figure 2, dotted
vertical lines mark the right endpoints of Lemma 1, i.e., there are nine maximal
cliques in $\bar{G}_I$ and they are $C_1 = \{a, f\}$, $C_2 = \{b, f\}$, $C_3 = \{c, d, f, j\}$, etc.

Lemma 1. [2] In an interval set $I$, a right endpoint represents a maximal clique
of $\bar{G}_I$ if and only if its previous endpoint in the sorted list of left and right
endpoints is a left endpoint.
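
Lemma 1 suggests a single sweep over the sorted endpoints. The sketch below counts the right endpoints that are preceded by a left endpoint and, for every interval, records the first and last maximal clique containing it. The function name `clique_ranges` and the tuple representation of intervals are assumptions made for illustration; the running counter plays the role of a prefix sum over the marked right endpoints.

```python
def clique_ranges(intervals):
    """Return (k, ranges) where k is the number of maximal cliques and
    ranges[x] = (first clique index, last clique index) of interval x.

    Intervals are (left, right) tuples with pairwise distinct endpoints;
    clique indices start at 1, following the left-to-right order.
    """
    events = []
    for idx, (left, right) in enumerate(intervals):
        events.append((left, 0, idx))    # 0 marks a left endpoint
        events.append((right, 1, idx))   # 1 marks a right endpoint
    events.sort()

    first = [0] * len(intervals)
    last = [0] * len(intervals)
    cliques_seen = 0                     # running count of marked right endpoints
    previous_is_left = False
    for _, kind, idx in events:
        if kind == 1:
            if previous_is_left:         # Lemma 1: this endpoint closes a maximal clique
                cliques_seen += 1
            last[idx] = cliques_seen
            previous_is_left = False
        else:
            first[idx] = cliques_seen + 1
            previous_is_left = True
    return cliques_seen, list(zip(first, last))


print(clique_ranges([(0, 3), (1, 4), (2, 5), (6, 8), (7, 9)]))
```

For the five example intervals there are two maximal cliques: the first three intervals form one and the last two form the other.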

For each interval $x \in I$, let $s_x$ and $l_x$ be the smallest and the largest $j$, respectively,
such that $x$ belongs to $C_j$. In Figure 2, $s_h = 4$ and $l_h = 6$ because interval
$h$ is in $C_4$, $C_5$ and $C_6$. Let $sltask(i, j)$, $1 \le i, j \le k$, be the set of intervals $x$ such
that $i \le s_x$ and $l_x \le j$. In Figure 2, $sltask(1, 5) = \{a, b, c, d, e, f\}$. Note that
algorithm m-seq($I$, $m$) in Figure 1 computes an optimal schedule for $G_{sltask(1,j)}$,
$1 \le j \le k$, because m-seq computes an optimal schedule for $G_{I(1,t)}$, $1 \le t \le n$,
and maximal cliques $C_1, \ldots, C_k$ of $\bar{G}_I$ are labeled by scanning endpoints of $I$
from left to right using Lemma 1. Let $len(i, j)$ be the minimum number of timesteps
required to schedule all tasks in $sltask(i, j)$, i.e., $opt(sltask(i, j))$.

Lemma 2. [2] For two intervals $x, y \in I$, $l_x < s_y$ if and only if $x.r < y.l$.

We now describe our parallel m-processor scheduling algorithm for interval
orders. Our algorithm consists of two parts. The first part is an m-LOS algorithm
m-length, which will be described in Section 4. Algorithm m-length computes
$len(1, j)$ for all $1 \le j \le k$. The second part computes an optimal schedule by
reducing the m-processor scheduling problem for an interval order to that of
finding a maximum matching in a convex bipartite graph.

We first describe the definition of a convex bipartite graph. A convex bipartite
graph $G$ is a triple $(A, B, E)$ such that $A = \{a_1, a_2, \ldots, a_n\}$ and $B =
\{b_1, b_2, \ldots, b_m\}$ are disjoint sets of vertices and the edge set $E$ satisfies the
following properties:

(1) Every edge of $E$ is of the form $(a_i, b_j)$.

(2) If $(a_i, b_j) \in E$ and $(a_i, b_{j+t}) \in E$, then $(a_i, b_{j+r}) \in E$ for every $1 \le r \le t$.

Property (1) is a bipartite property while property (2) is a convexity property.
It is clear that every convex bipartite graph $G = (A, B, E)$, where $A =
\{a_1, \ldots, a_n\}$ and $B = \{b_1, \ldots, b_m\}$, is uniquely represented by a set of triples:
$T = \{(a_i, g_i, h_i) \mid 1 \le i \le n\}$, where $g_i = \min\{j \mid (a_i, b_j) \in E\}$ and
$h_i = \max\{j \mid (a_i, b_j) \in E\}$. Dekel and Sahni [5] developed an $O(\log^2 n)$-time
convex bipartite maximum matching algorithm using $O(n \log^2 n)$ operations on
the EREW PRAM.

Our m-processor scheduling algorithm is as follows. 

Algorithm m-schedule 

- Step 1: Compute $s_x$ and $l_x$ for every $x \in I$.

- Step 2: Let $L_0 = 0$. Let $L_j = len(1, j)$ for $1 \le j \le k$ and compute $L_j$.

- Step 3: Construct a convex bipartite graph $G_B = (A_B, B_B, E_B)$, where $A_B = I$,
$B_B = \{1, 2, \ldots, mL_k\}$ and $E_B$ is computed from $L_j$, $j \le k$, as follows. If an
interval $x \in I$ is in a maximal clique $C_t$ in $\bar{G}_I$, then $x$ is adjacent to all $j$ in $B_B$
such that $mL_{t-1} + 1 \le j \le mL_t$. Since an interval $x$ is in every $C_t$ such that
$s_x \le t \le l_x$ by Theorem 1, $G_B$ is represented by $T = \{(x, mL_{s_x - 1} + 1, mL_{l_x}) \mid
x \in I\}$.

- Step 4: Find a maximum matching in $G_B$. Then an optimal schedule for
$G_I$ is represented by an $m \times L_k$ matrix $M_B$, whose $j$-th column consists of
the tasks in $A_B$ matched with $m(j - 1) + 1, \ldots, mj$ in $B_B$ in the maximum
matching of $G_B$.
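
Steps 3 and 4 can be sketched sequentially as follows. This sketch does not reproduce the parallel algorithm of Dekel and Sahni; it swaps in the classical sequential greedy rule for convex bipartite matching (scan the slots in increasing order and give each slot to the available interval whose range ends first). The function name `m_schedule`, the input `ranges` (pairs $(s_x, l_x)$) and the list `L` with `L[j]` standing for $L_j$ are assumptions made for illustration.

```python
import heapq


def m_schedule(ranges, L, m):
    """Build the schedule columns from clique ranges and optimal lengths.

    ranges[x] = (s_x, l_x) for interval x; L[j] = len(1, j) with L[0] = 0.
    Returns a list of L[k] columns, each a list of interval indices.
    """
    k = len(L) - 1
    # Step 3: triple (g, h, x) means x may use slots g..h of B_B.
    triples = sorted((m * L[s - 1] + 1, m * L[l], x)
                     for x, (s, l) in enumerate(ranges))
    columns = [[] for _ in range(L[k])]
    available = []                       # heap of (h, x) for intervals with g <= slot
    nxt = 0
    # Step 4: greedy matching over slots 1..m*L[k].
    for slot in range(1, m * L[k] + 1):
        while nxt < len(triples) and triples[nxt][0] <= slot:
            _, h, x = triples[nxt]
            heapq.heappush(available, (h, x))
            nxt += 1
        while available and available[0][0] < slot:
            heapq.heappop(available)     # range already passed; cannot be matched
        if available:
            _, x = heapq.heappop(available)
            columns[(slot - 1) // m].append(x)
    return columns


ranges = [(1, 1), (1, 1), (1, 1), (2, 2), (2, 2)]
print(m_schedule(ranges, [0, 2, 3], 2))
```

The example reuses five intervals with two maximal cliques, $L_1 = 2$ and $L_2 = 3$; slot 4 stays empty because only three intervals may use slots 1 to 4.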

We now prove the correctness of algorithm m-schedule. 

Lemma 3. All the intervals in the same column of $M_B$ are independent of each
other in $G_I$.

Proof. By definition of $G_B$, all intervals that are adjacent to one of $mL_{j-1} +
1, \ldots, mL_j$ in $G_B$, $1 \le j \le k$, are also adjacent to all of $mL_{j-1} + 1, \ldots, mL_j$ and
they are all in the same maximal clique in $\bar{G}_I$. Therefore, the intervals matched
with $mL_{j-1} + 1, \ldots, mL_j$ in the maximum matching of $G_B$ are independent of
each other in $G_I$. Since all the intervals in columns $L_{j-1} + 1, \ldots, L_j$, $1 \le j \le k$,
in $M_B$ are independent of each other in $G_I$, we have the lemma.



Lemma 4. The convex bipartite graph $G_B = (A_B, B_B, E_B)$ has a maximum matching
of size $n$, i.e., all intervals in $A_B$ are matched in a maximum matching of
$G_B$.

Proof. Construct an edge set $E' \subseteq A_B \times B_B$ from $M_S$ constructed by algorithm
m-seq in Figure 1 as follows. $E' = \{(x, j) \mid x \in I$ is the $j$-th element of $M_S$ in
the column-major order$\}$. Then every edge $(x, j)$ in $E'$ satisfies $m(\tau - 1) + 1 \le
j \le m\tau$, where $\tau$ is the column number in $M_S$ at which $x$ is. We first show that
$E' \subseteq E_B$. Note that $\tau \le L_{l_x}$ because m-seq produces an optimal schedule for
$G_{sltask(1, l_x)}$. And we have $\tau > L_{s_x - 1}$ by the following.

- If $x$ is in the first row in $M_S$, then $\tau > L_{s_x - 1}$ because $x \notin sltask(1, s_x - 1)$
and the task in the first row uses a new time unit after time $L_{s_x - 1}$.

- If $x$ is in row $r$ such that $r \ge 2$, i.e., $x = M_S[r, \tau]$, then $M_S[1, \tau] \notin sltask(1, s_x -
1)$ because $M_S[1, \tau]$ and $x$ overlap by Fact 1, and thus $\tau > L_{s_x - 1}$.

Hence $L_{s_x - 1} + 1 \le \tau \le L_{l_x}$. Since $x$ is adjacent to all $t$ such that $mL_{s_x - 1} + 1 \le
t \le mL_{l_x}$ in $E_B$, every edge $(x, j)$ in $E'$ is also in $E_B$. Since the $j$'s are distinct, $E'$ is
a maximum matching of size $n$ in $G_B$.



Lemma 5. The $m \times L_k$ matrix $M_B$ is an optimal schedule for $G_I$.

Proof. Consider tasks $x$ and $y$ of $G_I$ such that $y$ is a successor of $x$. Let $\tau$ and $\tau'$
be the columns of $M_B$ at which $x$ and $y$ are, respectively. Note that $M_B$ has $L_k$
columns, which is $opt(I)$, and all tasks are in $M_B$ by Lemma 4. Since all the tasks
in the same column of $M_B$ are independent of each other in $G_I$ by Lemma 3, we
can prove that $M_B$ is an optimal schedule for $G_I$ by showing that $\tau' > \tau$.

Let $t$ and $t'$ be integers matched with $x$ and $y$, respectively, in the maximum
matching of $G_B$. Then $t \le mL_{l_x}$ and $mL_{s_y - 1} + 1 \le t'$ by definition of $G_B$. Since $y$
is a successor of $x$, we have $l_x < s_y$ by Lemma 2, which implies that $t'$ is greater
than $t$. Since $y$ must be in a different column of $M_B$ from that of $x$ by Lemma 3,
we have $\tau' > \tau$.



Theorem 2. An optimal schedule for $G_I$ on $m$ processors can be computed in
$O(\log^2 v + (n \log n)/v)$ time with $O(nv^2 + n^2)$ operations on the CREW PRAM,
where $v \le n$ is a parameter.

Proof. The correctness of algorithm m-schedule follows from Lemma 5. We will
show that m-schedule takes $O(\log^2 v + (n \log n)/v)$ time and $O(nv^2 + n^2)$ operations
on the CREW PRAM. Step 1 takes $O(\log n)$ time using $O(n \log n)$ operations
as follows. In a sorted endpoints sequence, put 1 at the right endpoints of
Lemma 1 and 0 in other endpoints and compute $s_x$ and $l_x$ using a prefix sum, i.e.,
the prefix sum at $x.l$ is $s_x - 1$ and the prefix sum at $x.r$ is $l_x$. Since we can compute
all $L_j$ ($= len(1, j)$) for $1 \le j \le k$ by running algorithm m-length in Section 4
only once, Step 2 takes $O(\log^2 v + (n \log n)/v)$ time with $O(nv^2 + n^2)$ operations.
Step 3 takes constant time using $O(n)$ operations. Step 4 takes $O(\log^2 n)$ time
with $O(n \log^2 n)$ operations using Dekel and Sahni's algorithm [5].

4 Computing the Length of an Optimal Schedule 

We now describe our m-LOS algorithm. We obtain our m-LOS algorithm in
Figure 3 by generalizing the 2-LOS algorithm in [2].

Algorithm m-length

begin
    for all $i, j$ with $1 \le i \le j \le k$ do in parallel
        compute $|sltask(i, j)|$
        $len_0(i, j) = \lceil |sltask(i, j)| / m \rceil$
    od
    for $r = 1$ to $\lceil \log n \rceil$ do
        for all $i, j$ with $1 \le i \le j \le k$ do in parallel
            $len_r(i, j) = \max_{i \le x \le j} \{ len_{r-1}(i, x) + len_{r-1}(x + 1, j) \}$
        od
    od
    print $len_{\lceil \log n \rceil}(1, k)$
end



Fig. 3. An efficient m-LOS algorithm 
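
A sequential simulation of the doubling recurrence in Fig. 3 may help to see how the table evolves. The sketch below is not the parallel implementation: the name `m_length`, the input `ranges` (pairs $(s_x, l_x)$), the choice of base-2 logarithm for the number of rounds, and the convention that the value for an empty index range $(j + 1, j)$ is 0 are all assumptions made here.

```python
import math


def m_length(ranges, k, m):
    """Return a table T with T[i][j] approximating len(i, j) for 1 <= i <= j <= k.

    ranges[x] = (s_x, l_x) is the first and last maximal clique of task x.
    """
    n = len(ranges)

    def sltask_size(i, j):
        return sum(1 for s, l in ranges if i <= s and l <= j)

    # Entries outside 1 <= i <= j <= k stay 0 (empty index range).
    table = [[0] * (k + 2) for _ in range(k + 2)]
    for i in range(1, k + 1):
        for j in range(i, k + 1):
            table[i][j] = -(-sltask_size(i, j) // m)     # integer ceiling

    rounds = math.ceil(math.log2(n)) if n > 1 else 0
    for _ in range(rounds):
        new = [row[:] for row in table]
        for i in range(1, k + 1):
            for j in range(i, k + 1):
                # Split the clique range at x and add the two partial lengths.
                new[i][j] = max(table[i][x] + table[x + 1][j]
                                for x in range(i, j + 1))
        table = new
    return table


print(m_length([(1, 1), (2, 2)], 2, 3)[1][2])
```

With two tasks in consecutive, disjoint cliques and three processors, the initial estimate $\lceil 2/3 \rceil = 1$ is too small; one round of the recurrence raises it to 2, which reflects that the first task must precede the second.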



We now prove the correctness of algorithm m-length. We first define sets
$\chi_1, \ldots, \chi_z$ of tasks for an interval order such that:

- all tasks in any $\chi_{i+1}$ are predecessors of all tasks in $\chi_i$ and

- the length of an optimal schedule equals $\sum_{1 \le i \le z} \lceil |\chi_i| / m \rceil$.

Our sets $\chi_i$ for m-processor scheduling are the generalization of those for
two-processor scheduling [3] tailored to the special case of interval orders. We
do not explicitly compute these sets in algorithm m-length in Figure 3; we only
make use of them for the proof of its correctness.

We define the sets $\chi_1, \ldots, \chi_z$ of tasks from the schedule $M_S$ computed by
algorithm m-seq in Figure 1 as follows. We recursively define tasks $v_i$ and $w_i$ for
$i \ge 1$. Let $v_1$ be the last task executed by processor $P_1$ (i.e., $v_1$ is $M_S[1, opt(I)]$)
and let $w_1$ be the (possibly empty) task $M_S[m, opt(I)]$. Given $v_i$, we define $w_{i+1}$ and
$v_{i+1}$ as follows. Suppose that $v_i$ is $M_S[1, \tau]$. Let $\tau'$ be the largest column number






Y. Chung, K. Park, and H.-C. Kwon 



less than $\tau$ in $M_S$ such that $M_S[m, \tau'].r > v_i.r$ or $M_S[m, \tau']$ is an empty task.
Then $w_{i+1}$ is $M_S[m, \tau']$ and $v_{i+1}$ is $M_S[1, \tau']$. In Figure 2, $v_1 = o$, and thus $w_2$
is an empty task and $v_2 = i$. Also $w_3 = j$ and $v_3 = c$. Note that each column
$\tau''$ such that $\tau' < \tau'' < \tau$ is full. Let $z$ be the largest index for which $w_z$ and $v_z$
are defined. We assume that $v_{z+1}$ is a special interval $\beta$ whose right endpoint
is smaller than all endpoints in $I$ and $\tau_{z+1} = 0$. Let $\tau_i$, $1 \le i \le z$, denote the
timestep at which $v_i$ is executed. Define $\chi_i$ to be $\{x \mid x$ is in column $\tau''$ such that
$\tau_{i+1} < \tau'' < \tau_i\} \cup \{v_i\}$. In Figure 2, the sets $\chi_i$ for $G_I$ are marked by thick lines
in the schedule. The characteristics of the sets $\chi_i$ are as follows.

Lemma 6. In $G_I$, every task $x \in \chi_i$ satisfies $x.r \le v_i.r$.

Proof. Since $\tau_{i+1}$ is the largest column number less than $\tau_i$ such that $M_S[m, \tau_{i+1}].r
> v_i.r$, we have $M_S[m, \tau''].r < v_i.r$ for $\tau_{i+1} < \tau'' < \tau_i$. Note that we assume
that an empty task has the largest right endpoint in $I$. Since the task in the last
row in each column has the largest right endpoint in the column by Fact 2, every
task $x$ in column $\tau''$ such that $\tau_{i+1} < \tau'' < \tau_i$ satisfies $x.r < v_i.r$. Therefore,
every $x \in \chi_i$ satisfies $x.r \le v_i.r$.



Lemma 7. In $G_I$, all tasks in $\chi_{i+1}$ are predecessors of all tasks in $\chi_i$.

Proof. Let $y$ be a task in $\chi_i$. Since every $x \in \chi_{i+1}$ satisfies $x.r \le v_{i+1}.r$ by
Lemma 6, we can prove the lemma by showing that $v_{i+1}.r < y.l$. Since $y.r \le v_i.r$
and $v_i.r < w_{i+1}.r = M_S[m, \tau_{i+1}].r$, we have $y.r < M_S[m, \tau_{i+1}].r$. Since $y$ is at one
of columns $\tau_{i+1} + 1, \ldots, \tau_i$, we have $M_S[1, \tau_{i+1}].r < y.r$ by Facts 2 and 3. Hence
$M_S[1, \tau_{i+1}].r < y.r < M_S[m, \tau_{i+1}].r$. If $y$ overlaps $M_S[1, \tau_{i+1}] = v_{i+1}$, then $y$
should be assigned to column $\tau_{i+1}$ in m-seq in Figure 1, which is a contradiction.
Therefore, $v_{i+1}.r < y.l$.



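Theorem 3. The length of an optimal schedule for $G_I$ is $\sum_{1 \le i \le z} \lceil |\chi_i| / m \rceil$.

Proof. Since each column $\tau''$ such that $\tau_{i+1} < \tau'' < \tau_i$ is full and $v_i = M_S[1, \tau_i]$
is in $\chi_i$, we get $\lceil |\chi_i| / m \rceil = \tau_i - \tau_{i+1}$. Therefore, $\sum_{1 \le i \le z} \lceil |\chi_i| / m \rceil$ is the number of
columns in $M_S$, which is $opt(I)$.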

When $m = 2$, Chung et al. [2] showed that $\chi_i$ equals $sltask(l_{v_{i+1}} + 1, l_{v_i})$ for
$1 \le i \le z$ and that $len_{\lceil \log n \rceil}(i, j)$ equals $len(i, j)$ for $1 \le i \le j \le k$. Similarly,
we can prove the correctness of algorithm m-length as follows.

Lemma 8. In $G_I$, $\chi_i \subseteq sltask(l_{v_{i+1}} + 1, l_{v_i})$ for $1 \le i \le z$.

Proof. Let $x$ be a task in $\chi_i$. Since $v_{i+1} \in \chi_{i+1}$ is a predecessor of $x$ by Lemma 7,
we have $v_{i+1}.r < x.l$, which implies $l_{v_{i+1}} < s_x$ by Lemma 2. Since $x.r \le v_i.r$ by
Lemma 6, we have $l_x \le l_{v_i}$. Therefore, $x$ is in $sltask(l_{v_{i+1}} + 1, l_{v_i})$.

Corollary 1. In $G_I$, $\bigcup_{i \le t \le j} \chi_t \subseteq sltask(l_{v_{j+1}} + 1, l_{v_i})$ for $1 \le i \le j \le z$.






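Corollary 2. In $G_I$, all tasks in $sltask(l_{v_{j+1}} + 1, l_{v_i})$, $i \le j$, are successors of
all tasks in $\bigcup_{j+1 \le t \le z} \chi_t$ and predecessors of all tasks in $\bigcup_{1 \le t \le i-1} \chi_t$.

Lemma 9. Every task in $sltask(l_{v_{i+1}} + 1, l_{v_i})$ is in one of columns $\tau_{i+1} + 1, \ldots, \tau_i$.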

Proof. Let y be a task in sltask{ly._^^^+l, ly.). Note that y satisfies Ms[l, n+ij.r < 
y.l by Lemma El and y.r < Ms[l,Ti + l].l by Lemma 0 Therefore, y must be in 
one of columns r^+i + 1, . . . , by the way algorithm m-seq in Figure 0 works. 

Lemma 10. In $G_I$, $\sum_{i \le t \le j} \lceil |\chi_t| / m \rceil = len(l_{v_{j+1}} + 1, l_{v_i})$ for $1 \le i \le j \le z$.

Proof. The proof for the case $m = 2$ is in Lemma 8 in [2] and the proof of the
lemma is similar.

Lemma 11. In algorithm m-length, $len_r(i, j) \le len(i, j)$ for $0 \le r \le \lceil \log n \rceil$.

Proof. It is similar to the proof of Lemma 9 in [2].

Lemma 12. In algorithm m-length, $len_{\lceil \log n \rceil}(i, j) \ge len(i, j)$ for $1 \le i \le j \le k$.

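Proof. We show that $len_{\lceil \log n \rceil}(1, k) \ge len(1, k)$. We prove by induction on $r$
that for $i \le 2^r$,

$$len_r(l_{v_{x+i}} + 1, l_{v_x}) \ge \sum_{x \le t < x+i} \lceil |\chi_t| / m \rceil \qquad (1)$$

When $r = 0$, (1) holds as follows. Since each column $\tau''$ such that $\tau_{x+1} < \tau'' < \tau_x$
is full, $\lceil |sltask(l_{v_{x+1}} + 1, l_{v_x})| / m \rceil \ge \lceil |\chi_x| / m \rceil \ge \tau_x - \tau_{x+1}$ by Lemma 8. Since
$\lceil |sltask(l_{v_{x+1}} + 1, l_{v_x})| / m \rceil \le \tau_x - \tau_{x+1}$ by Lemma 9, we have $\lceil |sltask(l_{v_{x+1}} +
1, l_{v_x})| / m \rceil = \tau_x - \tau_{x+1}$. Therefore, $len_0(l_{v_{x+1}} + 1, l_{v_x}) = \lceil |sltask(l_{v_{x+1}} + 1, l_{v_x})| / m \rceil
= \lceil |\chi_x| / m \rceil$. Assume that (1) holds after $r$ iterations of the main loop. In the
$(r + 1)$st iteration for $2^r < i \le 2^{r+1}$,

$$len_{r+1}(l_{v_{x+i}} + 1, l_{v_x}) \ge len_r(l_{v_{x+i}} + 1, l_{v_{x+2^r}}) + len_r(l_{v_{x+2^r}} + 1, l_{v_x})$$

$$\ge \sum_{x+2^r \le t < x+i} \lceil |\chi_t| / m \rceil + \sum_{x \le t < x+2^r} \lceil |\chi_t| / m \rceil$$

$$\ge \sum_{x \le t < x+i} \lceil |\chi_t| / m \rceil.$$

Since each $\chi_i$ contains at least one task, there are at most $n$ sets $\chi_i$. Thus,

$$len_{\lceil \log n \rceil}(l_{v_{z+1}} + 1, l_{v_1}) \ge \sum_{1 \le i \le z} \lceil |\chi_i| / m \rceil \ge len(l_{v_{z+1}} + 1, l_{v_1}) \quad \text{by Lemma 10.}$$

Since $l_{v_{z+1}} + 1 = 1$ and $l_{v_1} = k$, we get $len_{\lceil \log n \rceil}(1, k) \ge len(1, k)$. Similarly, we
can prove that $len_{\lceil \log n \rceil}(i, j) \ge len(i, j)$ for $1 \le i \le j \le k$ by using the sets $\chi_i$
for $G_{I'}$, where $I'$ is $sltask(i, j)$ in $I$.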

Theorem 4. There is an m-LOS algorithm that requires $O(\log^2 v + (n \log n)/v)$
time and $O(nv^2 + n^2)$ operations on the CREW PRAM, where $v$ is a parameter
such that $v \le n$. Furthermore, it also computes the length of an optimal schedule
for $G_{sltask(1,j)}$, $1 \le j \le k$.

Proof. The correctness of algorithm m-length follows from Lemmas 11 and 12.
Algorithm m-length has a straightforward implementation using $O(\log^2 n)$ time
and $O(n^3)$ processors on the CREW PRAM. It can be improved to $O(\log^2 v +
(n \log n)/v)$ time and $O(nv^2 + n^2)$ operations using Galil and Park's reduction
technique [S|, which is similar to the proof of Theorem 3 in |2|. 

Abstract. We consider the problem of scheduling n independent jobs
on m unrelated machines to minimize $\max(t_1, t_2, \ldots, t_m)$, $t_i$ being
the completion time of machine i. In 0 was suggested a polynomial 2- 
approximation algorithm for this problem. It was also proved that there 
can exist no polynomial 1.5-approximation algorithm unless P = NP. 
Here we improve this earlier performance bound 2 to $2 - 1/m$. In fP is also

proved a general rounding theorem, which allows to construct in polyno- 
mial time 1-job approximations to the optimum, i.e. schedules with an 
absolute bound equal to the largest job processing time. We also improve 
this result and obtain a $(1 - 1/m)$-job approximation to the optimum.

Keywords: approximation algorithm, distribution, independent jobs, 
unrelated processors, makespan 



1 Introduction 

In this paper we consider one of the classical scheduling problems. We are given 
n tasks and m unrelated parallel processors. The processing time of a task on 
a processor is an arbitrary real number, quite independent from the processing 
time of any other task on that processor and from the processing time of this 
task on any other processor (this is in contrast with the situation with identical 
or uniform processors, when task processing times are more restricted). No task 
preemption is allowed, each machine can process at most one task at a time,
and we wish to minimize $\max(t_1, t_2, \ldots, t_m)$, $t_i$ being the completion time of
machine i. Even if m = 2 and the processors are identical, the problem is
NP-hard [2]. Hence, no polynomial algorithm can build an optimal schedule for
$m \ge 2$ processors unless P = NP, and we try to approximate the optimum in
polynomial time. For a given schedule, the optimality (or performance) ratio is 
defined as the ratio of the makespan of this schedule to the optimal makespan. 

* Partially supported by CONACYT grant 980066 and Russian Foundation of Basic 
Researches grant 99-01-00009 

** Partially supported by CONACYT grant ^473100-5-28937A 
A schedule with the optimality ratio k is called k-optimal. An algorithm with
the worst-case optimality ratio k is called a k-approximation algorithm.

If the processors are identical, then a linear time list scheduling algorithm 
which works with arbitrary precedence relations, gives a worst-case ratio 2 — A 
(0)- An O(nlogn) MULTIFIT algorithm gives the optimality ratio for identical 
processors 13/11 and for uniform processors <7/5 (see Hl> 0)' Polynomial 
approximation schemes for uniform processors (the family of polynomial algo- 
rithms with optimality ratios arbitrary close to 1) were first suggested in |7j. 

With unrelated processors, much weaker approximability results are known.
The approximation scheme proposed in jH| is polynomial by n but non-polynomial 
by TO. This algorithm with the optimality ratio 1 -I- £ has time complexity 
je) and its space complexity is non-polynomial. For fixed to, i.e., when to 
is not an input on the problem, there is a liner by n polynomial approximation 
scheme by Jansen and Porkolab For non-fixed to, the first polynomial- 
time approximation algorithms for unrelated processors were proposed in 0 
with the optimality ratio to. This result was essentially improved in uni where 
polynomial-time algorithms with optimality ratio within 2y/m were proposed. 
Breakthrough in the area was due to 0 in which a polynomial algorithm with 
optimality ratio 2 was proposed. It was also proved that there can exist no po- 
lynomial algorithm with optimality ratio 3/2 or less, unless P = NP. This work 
was preceded by the paper dH, in which first was brought into the play the 
linear programming for this problem and produced an efficient but still non- 
polynomial by TO algorithm with optimality ratio 2. For a more detailed survey 
of the approximability results see US!. 

In this paper, relying on the results from 0, we present an improved polyno- 
mial algorithm for unrelated processors with the Graham’s performance bound 
for identical processors, i.e., with the worst-case ratio $2 - 1/m$. This is the best
result so far for to > 2. For to = 2, a linear time algorithm from m gives the 
similar result. 

An absolute error estimates the quality of a schedule in absolute terms and 
is the difference between the makespan of this schedule and an optimal one. The 
rounding theorem from 0 provides with polynomial algorithms which construct 
schedules with an absolute error equal to the maximal job processing time $p_{\max}$.
We improve this result as well, presenting a polynomial algorithm with an
absolute error $(1 - 1/m)\, p_{\max}$. A similar result for identical processors was obtained in

H2! in Theorem 1.2. 

2 Preliminaries 

In this section, we introduce the basic concepts and notations. A schedule assigns 
each task a processor, and also starting time on that processor, while a distribu- 
tion deals only with the assignment of tasks to processors, but doesn’t care about 
starting times of tasks on the assigned processors. In fact, the literature contains 
a number of results on distributions, we will mention some of them. Explicitly 
this concept has appeared for non-preemptive case under different names. For 
example, in m it is used the term partition, and in m it is used assignment.
For preemptive case we will use the term distribution which seems to us more 
adequate. 

For job $J$ and processor $M$, let us denote by $M(J)$ the time that the
complete execution of $J$ on $M$ takes, and by $M_\sigma(J)$ the time during which
$J$ is processed by $M$ in the schedule $\sigma$. $M_\sigma(J)/M(J)$ is the part of $J$ scheduled in $\sigma$ on
$M$ (if preemptions are allowed, then this ratio can be any real number from the
interval [0,1], and without preemptions we can have only 0s and 1s).

Every (preemptive or non-preemptive) schedule $\sigma$ defines its corresponding
distribution. The distribution associated with schedule $\sigma$ is a function which
assigns to each pair $J, M$ the value $\delta_\sigma(J, M) = M_\sigma(J)/M(J) \le 1$. For every job $J$
scheduled in $\sigma$, we have $\sum_{M} \delta_\sigma(J, M) = 1$, where the sum is taken over all machines
$M$ in $\sigma$.

The concept of distribution may be introduced and investigated independently
of the concept of schedule. This concept is simpler. We introduce more
formal definitions. For convenience, let us assume that all possible jobs
constitute a universal set of jobs which we denote by JOBS. This set might be
finite or infinite, but all schedules and distributions are defined only on finite
subsets of JOBS. The processors or machines are defined as functions from JOBS
to nonnegative real numbers $R^+$. If $M$ is a processor and $J$ is a job, $M(J)$ is
the time needed to execute $J$ on $M$. The set of all processors is denoted by
PROC. A multiprocessor or a processor system is a finite linearly ordered set of
processors denoted by $\mathcal{M} = \{M_1, M_2, M_3, \ldots, M_m\}$. A multiprocessor consisting
of $m$ processors is called an m-multiprocessor. A job system is a linearly ordered
finite set of jobs $\mathcal{J} = \{J_1, J_2, \ldots, J_n\}$. A job system with $n$ jobs is called an n-job
system.

A distribution is a function $\delta$: JOBS $\times$ PROC $\to R^+$, such that for every
job $J$, $\sum_{M \in \mathcal{M}} \delta(J, M)$ is 0 or 1. Again, $\delta(J, M)$ is the part of job $J$ assigned to
machine $M$. The jobs for which this sum takes value 1 are called distributed in
$\delta$ or $\delta$-distributed, and the rest of the jobs are called non-distributed. The set of
all $\delta$-distributed jobs is denoted by $JOBS(\delta)$. If $\delta(J, M) > 0$ we will say that
job $J$ is $\delta$-distributed on the machine $M$ (or the machine $M$ is $\delta$-occupied by $J$).
We denote by $\delta(J)$ the set of all machines on which job $J$ is distributed in $\delta$.
The set of all $\delta$-occupied machines is denoted by $PROC(\delta)$. We will say that a
distribution $\delta$ distributes a job system $\mathcal{J}$ on a multiprocessor $\mathcal{M}$ if $\mathcal{J} = JOBS(\delta)$
and $PROC(\delta) \subseteq \mathcal{M}$.

Let us note that a convex combination of two distributions (i.e. $p\delta + (1 - p)\delta'$,
$1 \ge p \ge 0$) is a distribution. The sum of two distributions $\delta$ and $\delta'$ is a
distribution iff $JOBS(\delta) \cap JOBS(\delta') = \emptyset$. We will call such distributions disjoint.
If the inequality $\delta(J, M) \le \delta'(J, M)$ holds for all $J, M$, then we will write $\delta \subseteq \delta'$ and
say that $\delta$ is a sub-distribution of $\delta'$, and $\delta'$ is an extension of distribution $\delta$. In
this case, as is easy to see, $JOBS(\delta) \subseteq JOBS(\delta')$ and $\delta(J, M) = \delta'(J, M)$ for
any $M$ and $\delta$-distributed job $J$.

$M_\delta(J) = \delta(J, M)\, M(J)$ is the time needed for machine $M$ to execute the
part of job $J$ assigned to it in $\delta$. $\sum_{J \in \mathcal{J}} M_\delta(J)$ is called the load time of machine
$M$ in distribution $\delta$ and is denoted by $|\delta|_M$. The load time $|\sigma|_M$ of machine $M$ in a
schedule $\sigma$ is its load time in the associated distribution, so $|\sigma|_M = \sum_{J \in \mathcal{J}} M_\sigma(J)$.
Let us call the maximal load time of the machines in a distribution $\delta$ its makespan
and denote it by $|\delta|_{\max}$. The makespan of a schedule $\sigma$ will also be denoted by
$|\sigma|_{\max}$.

The distribution $\delta$ is optimal if it has the minimal makespan among all
distributions which distribute the same set of jobs on the same multiprocessor. The
problem of constructing an optimal distribution is equivalent to the following
linear programming problem:

Minimize $D_{opt}$ subject to

$$\sum_{i=1}^{n} M_j(J_i)\, x_{ij} \le D_{opt}, \quad j = 1, \ldots, m; \qquad \sum_{j=1}^{m} x_{ij} = 1, \quad i = 1, \ldots, n; \qquad x_{ij} \ge 0.$$

To see this equivalence, we let $x_{ij} = \delta(J_i, M_j)$. For further references,
we abbreviate this linear programming problem by LP($D_{opt}$).

Construction of an optimal schedule can be split into two stages. On the first 
stage we construct an optimal distribution, and on the second stage we construct 
an optimal schedule with this distribution. If the distribution is non-preemptive, 
the second stage is trivial. The problem of construction of an optimal schedule, 
associated with the given distribution, was thoroughly investigated in jI3]. The 
concept of an open shop, introduced in this paper, is almost the same as that
of a distribution. Let $\delta$ be a distribution of $\mathcal{J}$ on $\mathcal{M}$. Then one can interpret
every $J \in \mathcal{J}$ as a job consisting of $m$ subtasks, where task number $i$ has to be
performed on processor $M_i$ and its execution takes the time $(M_i)_\delta(J)$. In this
way every distribution generates an open shop and vice versa.

Unfortunately, not every preemptive distribution has an associated schedule
with the same makespan. The processing time of a job $J$ in a distribution $\delta$
is defined as the total time during which this job is processed on all machines,
i.e. $\sum_{M} M_\delta(J)$. Let us denote by $|\delta|^{\max}$ the maximum of these processing
times over all jobs, so $|\delta|^{\max}$ is the largest processing time in $\delta$. Since each
job has to be performed sequentially, i.e., a job cannot be processed at any moment
by more than one machine, the makespan of every schedule $\sigma$ associated with a
given distribution $\delta$ cannot be less than $|\delta|^{\max}$. Let us call the sequential
makespan of a distribution the maximum between $|\delta|_{\max}$ and $|\delta|^{\max}$. We see that
the makespan of every schedule associated with a given distribution cannot be
less than its sequential makespan. Now,
a principal result from US! can be formulated as follows. 

Theorem Gonzales and Sahni m) There exists a polynomial algorithm 
which constructs for every distribution an associated schedule with makespan 
equal to the sequential makespan of this distribution. 
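
The quantities defined above (machine load times, makespan, job processing times and sequential makespan) are easy to compute for a concrete distribution. The sketch below is illustrative only: the dictionary layout, with keys `(job, machine)`, and the function names are assumptions made here, and exact rational arithmetic is used to avoid rounding noise.

```python
from fractions import Fraction


def load_times(delta, time):
    """Load time of each machine: sum over jobs of delta(J, M) * M(J)."""
    loads = {}
    for (job, machine), part in delta.items():
        loads[machine] = loads.get(machine, Fraction(0)) + part * time[(job, machine)]
    return loads


def processing_times(delta, time):
    """Processing time of each job: sum over machines of delta(J, M) * M(J)."""
    totals = {}
    for (job, machine), part in delta.items():
        totals[job] = totals.get(job, Fraction(0)) + part * time[(job, machine)]
    return totals


def sequential_makespan(delta, time):
    """Maximum of the makespan and the largest job processing time."""
    makespan = max(load_times(delta, time).values())
    longest_job = max(processing_times(delta, time).values())
    return max(makespan, longest_job)


# One job split evenly between two machines that each need 4 time units for it.
time = {("J", "M1"): Fraction(4), ("J", "M2"): Fraction(4)}
delta = {("J", "M1"): Fraction(1, 2), ("J", "M2"): Fraction(1, 2)}
print(load_times(delta, time), sequential_makespan(delta, time))
```

In this example each machine is loaded for 2 time units, but the single job needs 4 time units in total and cannot run on both machines at once, so the sequential makespan is 4 rather than 2.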

Based on this theorem, in m the problem of constructing an optimal
preemptive schedule is reduced to the following linear programming problem 



Any solution of this problem gives a distribution with the minimal sequential 
makespan. 

3 Previous Results on Rounding 

Thus the construction of an optimal preemptive distribution is polynomially
solvable, in contrast to the non-preemptive case, which is the subject of our study.
Let us first give the rounding approach, first applied in m and later essentially 
improved in [Q. This approach allows us to produce good approximation to an 
optimal non-preemptive schedule. 

For a real $x$, let $[x]$ and $\{x\}$ be its integral and fractional parts, respectively
(we note that we use $\{.\}$ for representation of sets as well). So $x = [x] + \{x\}$, $[x]$
is an integer and $0 \le \{x\} < 1$. $[\delta(J, M)]$ and $\{\delta(J, M)\}$ define distributions $[\delta]$ and
$\{\delta\}$, which we will call the integral and the fractional parts of $\delta$, respectively;
clearly, $[\delta]$ and $\{\delta\}$ are disjoint. A distribution is integral or non-preemptive if
its fractional part is 0, and in this case it coincides with its integral part. Jobs
distributed by $\{\delta\}$ are said to be preempted in $\delta$. An integral distribution $\delta$ is a
rounding of another distribution 5' if it distributes the same jobs and 5 = [J']. 
So J -I- Jo, where Jo = {J^}- 

To find a non-preemptive distribution close to an optimal one, the rounding
method looks for an optimal extremal preemptive distribution and rounds it.
A distribution $\delta$ is extremal if it cannot be represented in the form $\frac{1}{2}(\delta' + \delta'')$,
where $\delta'$ and $\delta''$ are different distributions such that for every machine M,
$|\delta'|_M = |\delta''|_M = |\delta|_M$. The importance of extremal distributions is shown by the
following principle.

Extremality Principle. All distributions constructed by a linear programming
solution of LP($D_{opt}$) are extremal.

Proof. Denote by $\Delta$ the set of all distributions of an n-job system on an
m-multiprocessor $\mathcal{M}$. This set represents a convex (possibly unbounded) polytope
in the space $R^{mn}$. In LP($D_{opt}$), consider a subset $\Delta'$ of the product $R^{mn} \times R$ defined
by the conditions $\Delta' = \{(\delta, D) \mid \delta \in \Delta,\ D \in R,\ |\delta|_{\max} \le D\}$.

If $\delta = \frac{1}{2}(\delta_1 + \delta_2)$ where $|\delta|_M = |\delta_1|_M = |\delta_2|_M$ for all machines M, then
$(\delta, D) \in \Delta'$ implies $(\delta_i, D) \in \Delta'$ for i = 1, 2. Hence, $(\delta, D)$ is a middle point
of $(\delta_1, D)$ and $(\delta_2, D)$, and therefore is not a vertex of $\Delta'$. But linear
programming algorithms work only with vertices of $\Delta'$. Hence they work only with extremal
distributions.

The rounding method is based on the following known principle (see
[?], [?]), which will be proved in the next section.

The Preemption Bounding Principle. If $\delta$ is an extremal distribution
on a multiprocessor $\mathcal{M} = \{M_1, \dots, M_m\}$, then the number of jobs in $JOBS(\{\delta\})$
(i.e., preempted jobs in $\delta$) is less than m.

Let us remark that for an integral distribution $\delta$, $\delta(J)$ represents a unique
machine. Let us say that an integral distribution $\delta$ is 1-1 if $\delta(J) \ne \delta(J')$ for
every pair of different $\delta$-distributed jobs J and J'. Since the number of preempted jobs in an
extremal distribution does not exceed the number of machines (see the Preemption
Bounding Principle), we can accomplish a 1-1 rounding, i.e., find a
rounding $\delta'$ of $\delta$ for which $\delta' - [\delta]$ is a 1-1 distribution. In particular, applying
1-1 rounding to an optimal extremal distribution, we immediately obtain the
following result ($p_{\max}$ below is the maximal task processing time):

A 1-job Approximation Theorem. It is possible to construct in polynomial time
a distribution whose makespan exceeds the optimal makespan by no more than
$p_{\max}$.

As we will see below, even an optimal 1-1 rounding can be accomplished
in polynomial time. Let us define the selection problem as follows. We say that
a given family $X_1, \dots, X_k$ of subsets of a set X is selectable if there exists a
sequence of points $x_1, \dots, x_k \in X$ (called a selection) such that $x_i \in X_i$ for all $i \le k$ and
all $x_i$ are distinct.

This selection problem can be easily solved via the complete matching problem
in a bipartite graph $\{V_1, V_2, E\}$, with vertices $V_1 = X$ and $V_2 = \{1, 2, \dots, k\}$,
where the pair x, i is an edge iff $x \in X_i$. This matching problem is known to be
polynomially solvable via the maximal flow algorithm.
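
As an illustration, the selection problem can be sketched in Python. The sketch below uses augmenting paths (Kuhn's bipartite matching algorithm) in place of the maximal flow algorithm mentioned above; both are polynomial. The function name `select` and the list-of-sets input are assumptions of this sketch.

```python
def select(family):
    """Return distinct representatives x_i in X_i for a list of sets,
    or None if the family is not selectable (illustrative sketch)."""
    owner = {}  # point -> index of the set it currently represents

    def try_assign(i, seen):
        # Try to give set i a point, re-routing earlier choices if needed.
        for x in sorted(family[i]):
            if x in seen:
                continue
            seen.add(x)
            if x not in owner or try_assign(owner[x], seen):
                owner[x] = i
                return True
        return False

    for i in range(len(family)):
        if not try_assign(i, set()):
            return None
    selection = [None] * len(family)
    for x, i in owner.items():
        selection[i] = x
    return selection
```

The recursion depth is bounded by the number of sets, which is harmless for families of the size considered here (at most m sets).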

Due to Hall's Marriage Theorem (see, for example, [?]), our selection
problem has no solution (i.e., there is no complete matching) iff there exists a
subset $I \subset V_2$ such that the set $Y = \bigcup_{i \in I} X_i \subset X$ has fewer
elements than I.

Lemma 1. Let $\mathcal{J}$ be an n-job system, $\mathcal{M}$ be an m-multiprocessor, $m \ge n$, and
$\{c_i\}_{i \le m}$ be real numbers. Then it is possible to construct in polynomial time a
1-1 distribution $\delta$ of $\mathcal{J}$ on $\mathcal{M}$ such that $|\delta|_{M_i} \le c_i$ for all i, or to prove that
such a distribution does not exist.

Proof. For every job J let $\mathcal{M}(J) = \{M_i \in \mathcal{M} \mid M_i(J) \le c_i\}$. The construction of a
1-1 distribution with $|\delta|_{M_i} \le c_i$ is equivalent to the selection problem for the
family $\mathcal{M}(J)$.

This lemma together with Lemma 1 in [?] gives us the following:

Theorem 1. Let $\mathcal{J}$ be an n-job system, and $\mathcal{M}$ be an m-multiprocessor, such
that $n \le m$. Then the optimal 1-1 rounding can be constructed in polynomial
time.
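
One way such an optimal 1-1 rounding could be computed is sketched below in Python. The sketch rests on several assumptions that are not taken from the paper: the loads of the integral part are given as `base_load`, "optimal" is read as "minimal resulting makespan", the threshold test of Lemma 1 is applied with $c_i = T - $ `base_load[i]`, and the threshold T is found by binary search over the finitely many candidate values. Matching is done with augmenting paths.

```python
def optimal_one_to_one_rounding(base_load, times):
    """base_load[i]: load of machine i produced by the integral part.
    times[j][i]:  processing time of preempted job j on machine i.
    Returns (makespan, assignment), assignment[j] = machine of job j,
    or None if there are more preempted jobs than machines."""
    k, m = len(times), len(base_load)
    if k > m:
        return None
    if k == 0:
        return max(base_load, default=0), []

    def match(limit):
        owner = {}  # machine -> job placed on it

        def place(j, seen):
            for i in range(m):
                if i in seen or base_load[i] + times[j][i] > limit:
                    continue
                seen.add(i)
                if i not in owner or place(owner[i], seen):
                    owner[i] = j
                    return True
            return False

        if all(place(j, set()) for j in range(k)):
            assignment = [None] * k
            for i, j in owner.items():
                assignment[j] = i
            return assignment
        return None

    # Feasibility is monotone in the limit, so binary search applies.
    candidates = sorted({base_load[i] + times[j][i]
                         for j in range(k) for i in range(m)})
    lo, hi, best = 0, len(candidates) - 1, None
    while lo <= hi:
        mid = (lo + hi) // 2
        assignment = match(candidates[mid])
        if assignment is not None:
            best = (candidates[mid], assignment)
            hi = mid - 1
        else:
            lo = mid + 1
    limit, assignment = best
    return max(limit, max(base_load)), assignment
```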

4 Acyclicity of Distributions

In [?] an optimization problem is considered for distributions with
additional restrictions, which forbid some jobs to be distributed on some machines.
The linear programming problem corresponding to this restricted optimization
problem can be obtained from LP($D_{opt}$) if some inequalities of the type $x_{ij} \ge 0$
are changed to the equalities $x_{ij} = 0$ ($x_{ij} = 0$ eliminates the possibility of
distribution of any part of the ith job to the jth machine). The Extremality Principle from
Section 3 also holds for this problem; the proof of this fact is similar to that of
Section 3.

To specify a restricted optimization problem, we assign to each job J the set
of machines $\mathcal{M}_J$ on which it is allowed to distribute J. We shall call
such an assignment a job-machine configuration. We will use C to denote
configurations. Formally, a job-machine configuration C is a multi-valued mapping
C: JOBS $\to$ PROC. The set of machines on which job J may be distributed in C is
denoted by C(J). We will consider only finite configurations; for some J, C(J)
might be empty.

A distribution $\delta$ is called restricted by configuration C or C-restricted if
$\delta(J) \subset C(J)$ for all $J \in$ JOBS. The set of C-restricted distributions is
convex. A distribution of a job system $\mathcal{J}$ with the minimal makespan among all
C-restricted distributions is called C-optimal.

For a distribution $\delta$, in [?] the so-called configuration graph $G(\delta)$ is introduced.
$G(\delta)$ is a bipartite graph $\{V_1, V_2, E\}$ such that $V_1 = JOBS(\delta)$, $V_2 = PROC(\delta)$
and there is an edge corresponding to the pair J, M in $G(\delta)$ iff $\delta(J, M) > 0$. We
will call a distribution connected if its configuration graph is connected.
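
A minimal Python sketch of the configuration graph and its connected components follows. The dictionary layout `delta[(job, machine)]` and the function name `components` are assumptions of this sketch; job and machine names are assumed to be strings.

```python
from collections import defaultdict


def components(delta):
    """Connected components of the configuration graph (illustrative sketch).

    delta[(job, machine)] > 0 is read as an edge between job and machine.
    Returns a list of (set_of_jobs, set_of_machines), one per component."""
    adjacent = defaultdict(set)
    for (job, machine), fraction in delta.items():
        if fraction > 0:
            adjacent[("J", job)].add(("M", machine))
            adjacent[("M", machine)].add(("J", job))
    seen, result = set(), []
    for start in sorted(adjacent):
        if start in seen:
            continue
        stack, component = [start], set()
        seen.add(start)
        while stack:  # depth-first traversal of one component
            v = stack.pop()
            component.add(v)
            for w in adjacent[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        jobs = {name for kind, name in component if kind == "J"}
        machines = {name for kind, name in component if kind == "M"}
        result.append((jobs, machines))
    return result
```

A distribution is connected in the sense above exactly when this function returns a single component.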

A sub-distribution $\delta'$ of a distribution $\delta$ is called its component if $G(\delta')$ is
a connected component of $G(\delta)$. From the definitions immediately follow

Lemma 2. For different components $\delta_i$ and $\delta_j$ the respective sets of occupied
machines are disjoint, i.e., $PROC(\delta_i) \cap PROC(\delta_j) = \emptyset$.

Lemma 3. Let $\delta$, $\delta'$ and $\delta''$ be such distributions that $PROC(\delta) \cap PROC(\delta') =
PROC(\delta) \cap PROC(\delta'') = \emptyset$ and $|\delta'|_M = |\delta''|_M$ for all machines. Then $|\delta' + \delta|_M =
|\delta'' + \delta|_M$ for all M.

Lemma 4. If $\delta = \frac{1}{2}(\delta' + \delta'')$ then $\delta(J) = \delta'(J) \cup \delta''(J)$ for all J.

The proofs are left to the reader.

Lemma 5. A distribution is extremal iff all its components are extremal. 

Proof. Suppose $\delta$ has a non-extremal component $\delta_0$ such that $\delta_0 = \frac{1}{2}(\delta'_0 + \delta''_0)$
and $|\delta_0|_M = |\delta'_0|_M = |\delta''_0|_M$. Then the processors occupied in $\delta'_0$ and $\delta''_0$ are included
in $PROC(\delta_0)$. Let $\delta_1 = \delta - \delta_0$. Then $PROC(\delta_0) \cap PROC(\delta_1) = \emptyset$. Further, let
$\delta' = \delta'_0 + \delta_1$ and $\delta'' = \delta''_0 + \delta_1$. Then $\delta = \frac{1}{2}(\delta' + \delta'')$ and $|\delta'|_M = |\delta''|_M = |\delta|_M$
for all M. Here the load times are equal because of Lemma 3.

In the opposite direction, suppose $\delta$ is not extremal and consider its representation
$\delta = \frac{1}{2}(\delta' + \delta'')$ where $|\delta|_M = |\delta'|_M = |\delta''|_M$ for all M. Let J be a job for which
$\delta'(J) \ne \delta''(J)$ and $\delta_0$ be a component of $\delta$ for which $J \in JOBS(\delta_0)$. Denote by $\delta'_0$
and $\delta''_0$ the restrictions of $\delta'$ and $\delta''$, respectively, on $JOBS(\delta_0)$. Then $\delta_0 = \frac{1}{2}(\delta'_0 + \delta''_0)$
and to prove the non-extremality of $\delta_0$ it is sufficient to check equality of the
load times. Let $M \in PROC(\delta_0)$. As $\delta'(J) \subset \delta(J)$ for all $J \notin JOBS(\delta_0)$, we
obtain $M \notin \delta'(J)$ for these jobs. Hence, all jobs distributed by $\delta'$ on M belong to $JOBS(\delta_0)$
and are distributed by $\delta'_0$. Therefore, $|\delta'_0|_M = |\delta'|_M = |\delta_0|_M$. If $M \notin PROC(\delta_0)$,
then for all $J \in JOBS(\delta_0)$, $M \notin \delta(J)$ and hence $M \notin \delta'(J)$. This implies that
the load time of M in $\delta'_0$, as well as in $\delta_0$, is 0. Thus we proved equality of the
load times for $\delta'_0$. We use similar reasoning for $\delta''_0$ and the lemma is proved.

Let us say that a distribution $\delta$ is synchronous over a multiprocessor $\mathcal{M}$ if
$|\delta|_M = |\delta|_{M'}$ for all $M, M' \in \mathcal{M}$.

Lemma 6. A connected C-optimal distribution is synchronous over PROC(C).

Proof. Suppose $\delta$ is not synchronous and let M be a machine for which $|\delta|_{\max} >
|\delta|_M$. Consider a machine M' with the maximal load time. Since $G(\delta)$ is connected,
there exists a path in it connecting M and M'. Let $M = M_1, J_1, M_2, J_2,
\dots, M_{k+1} = M'$ be such a path and let i be the maximal index such that
$|\delta|_{M_i} < |\delta|_{\max}$. $J_i$ occupies both $M_i$ and $M_{i+1}$. Let us choose $\varepsilon > 0$ so small
that $\varepsilon < \delta(J_i, M_{i+1})$ and $\varepsilon M_i(J_i) + |\delta|_{M_i} < |\delta|_{\max}$, and define a new distribution
$\delta'$ as follows: $\delta'(J_i, M_i) = \delta(J_i, M_i) + \varepsilon$, $\delta'(J_i, M_{i+1}) = \delta(J_i, M_{i+1}) - \varepsilon$ and
$\delta'(J, M) = \delta(J, M)$ for any other job-machine pair.

The obtained distribution $\delta'$ contains fewer machines with the load time $|\delta|_{\max}$,
and has the same configuration graph as $\delta$. Repeating the above procedure, we
can construct a distribution with the same configuration as $\delta$ and with a smaller
makespan. But this contradicts the optimality of $\delta$. The lemma is proved.

The number of machines in $\delta(J)$ minus 1 will be called the number of
preemptions of J in $\delta$ and will be denoted by $\pi_\delta(J)$. $\pi(\delta) = \sum_{J \in \mathcal{J}} \pi_\delta(J)$ is the total
number of preemptions in $\delta$.

Lemma 7. The total number of preemptions in every connected C-optimal
extremal distribution $\delta$ is strictly less than the number of occupied machines in $\delta$.

Proof. Let $\delta$ occupy machines $M_1, \dots, M_m$ and let $J_1, J_2, \dots, J_k$ be the preempted
jobs in $\delta$. For every $i \le k$, let j(i) be the first j for which $\delta(J_i, M_j)$ is fractional.
Denote by P the set of pairs i, j with $j \ne j(i)$ and such that $0 < \delta(J_i, M_j) < 1$. The
number of elements in P is $\pi(\delta)$. Let $\varepsilon$ be the minimum of the fractional values
$\delta(J_i, M_j)$. For every real
function f on P such that $|f(i,j)| < \varepsilon/m$, let $\delta_f(J_i, M_{j(i)}) = \delta(J_i, M_{j(i)}) -
\sum_j f(i,j)$, $\delta_f(J_i, M_j) = \delta(J_i, M_j) + f(i,j)$ if $(i,j) \in P$, and let $\delta_f(J, M) =
\delta(J, M)$ otherwise. The distribution $\delta_f$, as follows from its definition, is
C-restricted. To define a linear mapping l of an $\varepsilon/m$-cube Q of Euclidean space
of dimension $\pi(\delta)$ into (m - 1)-dimensional space, enumerate the pairs in P. Then
each point $x \in Q$ corresponds to a real function $f_x$ on P, and we can define
$l(x)$ as a vector in which the ith component is $|\delta_{f_x}|_{M_i} - |\delta|_{M_i}$. If $\pi(\delta) \ge m$, then
the kernel of the mapping l is nontrivial. Let $x \in Q$ be a nonzero point for
which $l(x) = 0$ and let f be the function corresponding to x. If $|\delta_f|_{M_m} = |\delta|_{M_m}$,
then $|\delta_{-f}|_{M_m} = |\delta|_{M_m}$ and we come to a contradiction with the extremality of
$\delta$, because $\delta = \frac{1}{2}(\delta_{-f} + \delta_f)$. The optimality of $\delta$ implies that $|\delta_f|_{M_m} > |\delta|_{M_m}$.
Indeed, if $|\delta_f|_{M_m} < |\delta|_{M_m}$, then $\delta_f$ is optimal as well, but not synchronous. But
the same reasoning applied to $\delta_{-f}$ gives us the similar inequality $|\delta_{-f}|_{M_m} > |\delta|_{M_m}$.
Then we come to a contradiction with $\delta = \frac{1}{2}(\delta_{-f} + \delta_f)$. The lemma is proved.

We will call a distribution acyclic if its configuration graph is acyclic. Let us
say that a distribution $\delta$ is componentwise C-optimal if each of its components is
C-optimal. Our main result can now be formulated as follows.

Theorem 2. Every componentwise optimal and extremal distribution is acyclic.

Proof. If a distribution is connected, then the number of its preemptions is
less than the number of the corresponding occupied machines (by Lemma 7).
Hence its configuration graph has fewer edges than vertices. For a connected graph
this implies acyclicity. If the distribution is not connected, then consider its
components. They are extremal by Lemma 5. They are C-optimal as well by our
assumption. Hence the same argument shows their acyclicity.
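
The counting argument can be checked on small examples with the following Python sketch. It tests acyclicity of the configuration graph with a union-find structure and counts preemptions directly from the definition; the function names and the dictionary layout `delta[(job, machine)]` are assumptions of this sketch.

```python
def is_acyclic(delta):
    """True iff the configuration graph of delta is a forest (sketch)."""
    parent = {}

    def find(v):
        # Union-find root lookup with path halving.
        while parent.setdefault(v, v) != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for (job, machine), fraction in delta.items():
        if fraction <= 0:
            continue
        a, b = find(("J", job)), find(("M", machine))
        if a == b:  # the edge would close a cycle
            return False
        parent[a] = b
    return True


def preemptions(delta):
    """Total number of preemptions: sum over jobs of (machines used - 1)."""
    machines_of = {}
    for (job, machine), fraction in delta.items():
        if fraction > 0:
            machines_of.setdefault(job, set()).add(machine)
    return sum(len(ms) - 1 for ms in machines_of.values())
```

In the first test case below the graph is a path on two jobs and three machines, so it has two preemptions, fewer than its three occupied machines; in the second, two jobs share the same two machines and a cycle appears.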



5 Consolidation of Distributions 

Let us say that a non-preemptive (integral) distribution $\delta$ is a consolidation of
a preemptive distribution $\delta'$ if it has the same domain and $\delta(J, M) = \delta'(J, M)$
for all J which are not preempted in $\delta'$.

The main result of this section is the following theorem.

Theorem 3. For every acyclic distribution $\delta$ on an m-processor, it is possible
to construct in polynomial time a consolidation $\delta'$ such that
$|\delta'|_{\max} \le |\delta|_{\max} + \frac{m-1}{m}\, p^{\delta}_{\max}$.

The proof is based on some delicate considerations connected with graphs.
Besides the configuration graph introduced earlier, we shall consider
another type of graph for presenting the preemptive structure of
distributions. We call it a preemption graph and denote it by $G_\delta$. $G_\delta$ has fewer nodes and
edges than the corresponding configuration graph.

The nodes of $G_\delta$ represent the machines, and the edges represent the jobs. There
is an edge joining a pair of nodes in $G_\delta$ if the job which this edge represents
is shared by the machines which these nodes represent. The preemption graph,
in general, is a multi-graph. But if the configuration graph is acyclic, then the
corresponding preemption graph is simple. Indeed, it is sufficient to prove that
$\delta(J_1) \cap \delta(J_2)$ cannot contain two machines. But if it contains two machines $M_1$
and $M_2$, then we will have a nontrivial cycle $M_1, J_1, M_2, J_2$ in $G(\delta)$.

The preemption graph may have cycles even if the configuration graph is acyclic. For
example, if $\delta(J)$ contains three machines $M_1$, $M_2$ and $M_3$, then the vertices
corresponding to $M_i$, i = 1, 2, 3, form a cycle in $G_\delta$.

A reduced preemption graph $G'_\delta$ has fewer edges than $G_\delta$, and this graph
is acyclic iff $G(\delta)$ is acyclic. The structure of $G'_\delta$ (unlike that of $G_\delta$ and $G(\delta)$)
depends not only on the distribution $\delta$, but also on the order in which the
machines are numbered. $G'_\delta$ is a subgraph of $G_\delta$, obtained by deleting the so-called
redundant edges in $G_\delta$. Again, whether an edge is redundant or not depends on
the order in $\mathcal{M}$. An edge $(x, y) \in G_\delta$ is redundant if there exist two (or more)
intermediate edges (x, z) and (z, y) such that the index of machine z is greater
than that of machine x and less than that of machine y, and all of x, y and z share
the same job.

Proposition 1. $G'_\delta$ is acyclic iff $G(\delta)$ is acyclic. The number of edges in $G'_\delta$ is
equal to the number of preemptions in the distribution $\delta$.
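
Under the definition above, the non-redundant edges of a job are exactly those joining machines that are consecutive, in index order, among the machines sharing that job. A minimal Python sketch, assuming machines are identified by their integer indices and that `delta[(job, machine_index)]` holds the fractions (both assumptions of this illustration):

```python
def reduced_preemption_edges(delta):
    """Edges (x, y, job) of the reduced preemption graph (sketch).

    Machines are assumed to be integer indices; an edge joins two
    machines that are consecutive among those sharing the same job."""
    machines_of = {}
    for (job, machine), fraction in delta.items():
        if fraction > 0:
            machines_of.setdefault(job, []).append(machine)
    edges = []
    for job, machines in machines_of.items():
        machines.sort()
        for x, y in zip(machines, machines[1:]):
            edges.append((x, y, job))
    return edges
```

A job occupying r machines contributes r - 1 edges, which is its number of preemptions, in agreement with the second statement of Proposition 1.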



Lemma 8. Let $P_1, P_2, \dots, P_k$ be connected subgraphs of an acyclic graph G,
pairwise having in common at most one node. Then for some i, the intersection of $P_i$
with $\bigcup P_j$, $j = 1, 2, \dots, k$, $j \ne i$, is a single node.

Proof. Let $N_1, \dots, N_k$ be all nodes of G which are common for some pairs of
our connected subgraphs, and let G' be the minimal connected subgraph of G
containing all $N_i$. The acyclicity of G implies the acyclicity of G'. Besides, G'
does not have any single-degree node different from every $N_i$. Indeed, if N were
a single-degree node in G' different from any $N_i$, then we could reduce G' by
eliminating N and the corresponding edge.

Let N be a single-degree node of G' and e be the edge in G' incident to
N. As $N = P_i \cap P_j$ for some i, j, either $P_i$ or $P_j$, suppose $P_i$, does not contain e.
Suppose that $P_i$ contains a node $N_j \ne N$. In this case we will have two different
paths between $N_j$ and N in G. The first such path is contained in $P_i$ and does
not pass through e, and the second one is in G' and passes through e. But then
we would have a cycle in G, which is a contradiction. Therefore, the intersection
of $P_i$ with the union of the rest of our connected subgraphs is exactly N and the
lemma is proved.

A weight function on the graph G is a function w which assigns to each node
N of G a nonnegative number w(N), such that the sum of all w(N) is equal
to 1. Let us say that a weight function w is supported by a subgraph P of G if
it takes value 0 for all nodes in $G \setminus P$.

Lemma 9. Let G be an acyclic graph with m nodes, $P_1, P_2, \dots, P_k$ its covering
by connected subgraphs which pairwise intersect in no more than one node. Further, let
$w_1, w_2, \dots, w_k$, $k < m$, be weight functions on G such that $w_i$ is supported
by $P_i$. Then it is possible in polynomial time to construct a sequence of nodes
$s_1, \dots, s_k$ such that $s_i \in P_i$ for all i, and $\sum_{i:\, s_i = N} (1 - w_i(N)) \le 1 - \frac{1}{m}$
for every node N.

Proof. Applying the above lemma we can first order the subgraphs $P_1, P_2, \dots, P_k$
of our decomposition in such a way that for all $i < k$, $P_i$ intersects the union
$P^i = \bigcup_{j > i} P_j$ in exactly one node, which we denote by $N_i$.

In the process of construction, we need to keep additional information
for every node: an array of nonnegative integers v(N). The algorithm is the
following.

Step 1. (initialization) v(N) := 0 for all N, i := 1 (counter of cycles).

Step 2. Search for a node M in $P_i$, different from $N_i$ and such that

$w_i(M) \ge v(M)/m + 1/m$;

(case 1): if such a node is found, then $s_i := M$;

(case 2): otherwise, $s_i := N_i$, $v(N_i) := v(N_i) + \sum_{M \in P_i \setminus N_i} (v(M) + 1)$.

Step 3. If i < k then i := i + 1; goto Step 2; else STOP.

To prove correctness of this algorithm, first note that $\sum_{M \in P^i} v(M)$ is no more
than the number of nodes in the complement to $P^i$, for all stages in our construction.
Indeed, from $P^{i-1}$ to $P^i$ we increase $\sum_{M} v(M)$ exactly by the number of the
deleted nodes from $P_i$, or we do not change it at all.

The algorithm works without stopping up to the last step. When we choose
$s_k$ there is no $N_k$. So we have to find an M satisfying the inequality $w_k(M) \ge
v(M)/m + 1/m$. Suppose that there is no such M. In this case, for all M we
have $w_k(M) < v(M)/m + 1/m$. If we sum these inequalities for all $M \in P_k$, we
obtain $1 < \frac{1}{m} \sum_{M \in P_k} (v(M) + 1)$. But $\sum_{M \in P_k} (v(M) + 1)$, as already noted, does not exceed
m and we come to a contradiction.

Let us note that if $s_i$ is different from $N_i$, then $w_i(s_i) \ge v(s_i)/m + 1/m$ and
this point cannot be chosen in the sequel. On the other hand, during the whole
process, for all N we have that

$\sum_{j:\, s_j = N} (1 - w_j(N)) \le v(N)/m$,

because v(N) is increased only when $N = N_i$ is selected, for some $P_i$, and it
is increased by $\sum_{M \in P_i \setminus N} (v(M) + 1)$. But the condition of selection of $N_i$ is that
$w_i(M) < v(M)/m + 1/m$ for all $M \in P_i \setminus N_i$. The sums of these inequalities provide
the desired result. To finish our proof it is sufficient to note that $v(N) \le m - 1$ for
all N.

Proof of Theorem 3. Let G be a reduced preemption graph of the fractional
part of our distribution. This graph is a subgraph of $G'_\delta$ and therefore is acyclic.
Let $J_1, \dots, J_k$ be all jobs preempted in $\delta$. For every job $J_i$ denote by $P_i$ the
subgraph whose edges correspond to $J_i$. For every node M of this subgraph, let
$w_i(M) = \delta(J_i, M)$. Then we obtain a decomposition $\{P_i\}$ of G and a system
of weights. By Lemma 9 we choose for each job $J_i$ a machine M(i). Now we
define $\delta'(J_i, M(i)) = 1$ for all i. This completely defines the consolidation. The
increase of the load time on machine M in $\delta'$ compared with that in $\delta$ is equal
to $\sum_{i:\, M(i) = M} (1 - \delta(J_i, M))\, M(J_i)$. As $M(J_i) \le p^{\delta}_{\max}$, this sum does not
exceed $(1 - \frac{1}{m})\, p^{\delta}_{\max}$. The theorem is proved.

6 The Worst-Case Bounds

As a simple consequence of the acyclicity and consolidation theorems, we build
the approximation algorithms in this section.

Given a configuration C and a job system $\mathcal{J}$, construct by linear
programming a C-optimal distribution $\delta$ of $\mathcal{J}$. Decompose it into components $\delta = \sum_i \delta_i$.
For all i, construct by linear programming extremal C-optimal distributions $\delta'_i$ of
$JOBS(\delta_i)$. Then the sum $\sum_i \delta'_i$ represents an extremal componentwise C-optimal
distribution of $\mathcal{J}$. This distribution will be acyclic owing to the Acyclicity
Theorem (Theorem 2). Now applying the consolidation algorithm of Theorem 3 we obtain a
distribution whose properties are given in the next theorem. Denote by $p^{C}_{\max}$ the maximum
of M(J), where $M \in C(J)$, and denote by $D^{C}_{opt}$ the makespan of a C-optimal
distribution of $\mathcal{J}$.

Theorem 4. For every job system $\mathcal{J}$ and every configuration C it is possible to
construct in polynomial time an integral distribution $\delta$ such that

$|\delta|_{\max} \le D^{C}_{opt} + \frac{m-1}{m}\, p^{C}_{\max}$.

If we consider a constant configuration, i.e., a configuration C such that
$C(J) = \mathcal{M}$ for all $J \in \mathcal{J}$, then, from the above theorem, we immediately obtain
the following improved version of the 1-job approximation theorem.

Corollary 1. There is a polynomial algorithm which constructs an integral
distribution of a job system $\mathcal{J}$ on a multiprocessor $\mathcal{M}$ with a makespan exceeding
the optimal one by no more than $\frac{m-1}{m}\, p_{\max}$.

Now we find it useful to recall some basic ideas behind the earlier mentioned
polynomial 2-approximation algorithm from reference [?]. One of the useful
concepts introduced in [?] is what we call the balanced distribution, a
distribution in which the jobs cannot be distributed on the machines on which
their execution time is "sufficiently large". More precisely, for a distribution $\delta$,
let us denote by $p^{\delta}_{\max}$ the maximum of $\{M(J) \mid \delta(J, M) > 0\}$ and call it the
$\delta$-largest processing time. Let us say that for some $B \in R^{+}$, $\delta$ is B-balanced if
$|\delta|_{\max} \le B$ and $p^{\delta}_{\max} \le B$. Denote by $B_{opt}$ the minimum B for which there
exists a B-balanced distribution. Note that $D_{opt} \le B_{opt} \le D^{opt}$. Indeed, every
non-preemptive distribution $\delta$ is auto-balanced (that is, $|\delta|_{\max}$-balanced) and this
implies that $B_{opt} \le D^{opt}$. We call a distribution which is $B_{opt}$-balanced an
optimally balanced distribution. $B_{opt}$ can be found in polynomial time, as is shown
in [?].
In the polynomial 2-approximation algorithm from [?], first an optimally
balanced extremal distribution $\delta$ is constructed and then consolidated. An integral
distribution $\delta$ is a consolidation of a distribution $\delta'$ if $\delta$ and $\delta'$ distribute the
same job system $\mathcal{J}$ on the same multiprocessor $\mathcal{M}$ and $\delta(J, M) = \delta'(J, M)$
whenever $\delta'(J, M)$ is integral. Consolidation is a special sort of rounding.
To build a consolidation from a given distribution, for each preempted job in
this distribution a single machine is determined and the job is completely placed
(scheduled) on that machine. If $\delta'$ is a consolidation of a B-balanced distribution
$\delta$, then $|\delta'|_{\max} \le |\delta|_{\max} + B$. Hence, a consolidation of an optimally balanced
distribution has the makespan $\le 2B_{opt} \le 2D^{opt}$.

An ingenious trick is developed in [?] to consolidate a B-balanced
distribution obtained by linear programming. This consolidation is produced by a
matching in the corresponding configuration graph and is 1-1. That is, different
preempted jobs occupy different machines in the consolidation. This trick is based on
the analysis of the structure of the configuration graph of the above distribution. It
is proved that this graph is a pseudo-forest, i.e., a graph each of whose components
has no more edges than nodes.

Let us denote by $B^{opt}$ the minimal possible makespan of auto-balanced
distributions. An auto-balanced distribution with this makespan is called optimal
auto-balanced. This distribution is extremal and hence acyclic. That is, for such
distributions the configuration graph is a forest. As is easy to see, $B_{opt} \le B^{opt} \le D^{opt}$.
The same argument as presented in [?] shows that $B^{opt}$ is calculable in polynomial
time. This $B^{opt}$ represents the best known polynomially calculable lower
estimate for $D^{opt}$.

The consolidation applied to the optimal auto-balanced distribution (which
is not in general 1-1) gives a $(2 - \frac{1}{m})$-approximation to the optimum.

Theorem 5. For every given configuration, a non-preemptive distribution with 
optimality ratio 2 — 1/m can be constructed in polynomial time. 

Given a configuration C, construct by linear programming a C-optimal
extremal distribution $\delta$. This distribution is acyclic

Theorem 6. There is a polynomial algorithm which constructs a schedule $\sigma$
with makespan exceeding that of an optimal schedule by no more than $\frac{m-1}{m}\, p_{\max}$.


Abstract. This paper proposes a new interpolation method based on
Kohonen self-organizing networks. This method performs very well, combining
an accuracy comparable with usual optimal methods (kriging)
with a shorter computing time, and is especially efficient when a great
amount of data is available. Under some hypotheses similar to those used
for kriging, unbiasedness and optimality of neural interpolation can be
demonstrated. A real-world problem is finally considered: building a map of
surface-temperature climatology in the Mediterranean Sea. This example
emphasizes the abilities of the method.



1 Introduction 

Physical data interpolation is a common issue in Geosciences. For many variables
of interest, the measurements are often sparse and irregularly distributed in time
and space. Analyzing the data usually requires a numerical model, which samples
the data on a regular grid. Mapping irregular measurements on a regular grid is
done by interpolation, which aims to generalize, but not to create, information.
A popular method to map geophysical data is kriging [?].

This method, based on the hypothesis that the measurements are realizations
of a random variable, has been proven to be optimal under certain conditions.
It requires solving a system of linear equations at each point where the
interpolation must be done, which might be computationally heavy.

This paper proposes an original interpolation method based on Kohonen
self-organizing networks. The method is applied to the problem of building a
surface-temperature climatology in the Mediterranean Sea. The method performs very
well, combining an accuracy comparable with usual kriging methods with a much
shorter computing time, and is especially efficient when a great amount of data
is available.

The paper is organized as follows. Section 2 recalls the background of kriging
techniques. Section 3 describes the adaptation of self-organizing maps to
the spatial interpolation problem. The results of actual data interpolation in
an oceanographic problem are presented and discussed. The last section draws
conclusions and perspectives.

J. van Leeuwen et al. (Eds.): IFIP TCS2000, LNCS 1872, pp. 126-^^^ 2000. 

2 Optimal Interpolation

A model of a physical variable aims at predicting its value anywhere at any time.
The simplest model is a numerical one, that is, a discrete representation of the
variable. To be efficient, this representation must be done under two constraints:
on the one hand no information must be missed, on the other hand a reasonable
amount of storage capacity is required. It must also be done on a regular grid,
in order to be usable by most analysis tools (plotting a map of the variable,
computing a Fourier transform, ...).

2.1 Definition of Interpolation 

Considering n values $Z_i$ of a variable (obtained by measurements) at locations
$x_i$, $1 \le i \le n$, interpolation aims at building a numerical model of the variable
on a regular pre-defined grid. A straightforward way to interpolate data at a
specific location (a point of the grid) is to make a linear combination of the data:

$$Z^* = \sum_{i=1}^{n} \lambda_i Z_i \qquad (1)$$

where $Z^*$ is the estimated value. The problem is to compute the weights $\lambda_i$ in
order to minimize the estimation error. Practically, this is not feasible, because
the true values are not known. It is thus necessary to make assumptions on the
behavior of the variable to define the optimality.

The simplest methods give higher weights to the nearest data. The weights
are in some way inversely proportional to the distance. This corresponds to an
implicit assumption of continuity of the variable, which seems reasonable for physical
variables. However, it is possible to do better, taking into account the spatial
correlation of the data. In this case, the weights are the solutions of a system
of linear equations, which can be obtained by writing the minimization of the
estimation error. This is kriging.
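
As an illustration of the simplest family of methods, the following Python sketch implements inverse-distance weighting as one possible instance of the linear combination (1). The exponent `power` and the function name `idw` are assumptions of this sketch; the text only states that the weights decrease with distance.

```python
import math


def idw(points, values, x0, power=2.0):
    """Inverse-distance-weighted estimate at x0 (illustrative sketch).

    points -- measurement locations, as tuples of coordinates
    values -- measured values Z_i at those locations
    power  -- distance exponent (an assumption of this sketch)
    """
    weights = []
    for point, value in zip(points, values):
        distance = math.dist(point, x0)
        if distance == 0.0:
            return value  # x0 coincides with a measurement location
        weights.append(distance ** -power)
    total = sum(weights)
    # Normalized weights play the role of the lambda_i and sum to 1.
    return sum(w / total * z for w, z in zip(weights, values))
```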

2.2 Kriging 

Kriging is based on a statistical interpretation of the measurements. Indeed, it assumes
that the data are realizations of a random variable, that is: $Z_i = Z(x_i)$. Some
hypotheses are required on the behavior of this random variable, usually that
the expectation of an increment is null, and that its variance only depends on the
distance (intrinsic random variable [?]):

$$E[Z(x + h) - Z(x)] = 0 \qquad (2)$$

$$\mathrm{Var}[Z(x + h) - Z(x)] = C(h) \qquad (3)$$

Therefore, at each point $x_0$ where the interpolation is to be done, it is possible to
write analytically the expectation and variance of the estimation error $Z^*(x_0) -
Z(x_0)$.

Unbiasedness. The nullification of the expectation (ensuring that the estimation
is not biased) leads to a constraint on the weights $\lambda_i$:

$$\sum_{i=1}^{n} \lambda_i = 1 \qquad (4)$$

Optimality. The minimization of the variance (that is, the optimality of the
estimation) under the constraint of Eq. 4 leads to a system of linear equations,
with coefficients depending on a model of the variance of the increment of the
data [?]:

where $C_{ij} = \mathrm{Var}[Z(x_i) - Z(x_j)] = E[(Z(x_i) - Z(x_j))^2]$ and $\mu$ is a Lagrange
multiplier.
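
A sketch of how such a system can be derived from the hypotheses (2)-(4) is given below. The notation $C_{i0} = \mathrm{Var}[Z(x_i) - Z(x_0)]$ and the sign and scaling chosen for the multiplier $\mu$ are assumptions of this sketch, so the resulting formulas may differ from the paper's own by such a convention.

```latex
% Using \sum_i \lambda_i = 1, the estimation error is a combination of increments:
Z^*(x_0) - Z(x_0) = \sum_{i=1}^{n} \lambda_i \bigl( Z(x_i) - Z(x_0) \bigr).

% With the identity ab = \tfrac12 (a^2 + b^2 - (a-b)^2) applied to the increments:
\mathrm{Var}[Z^*(x_0) - Z(x_0)]
  = \sum_{i=1}^{n} \lambda_i C_{i0}
  - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \lambda_i \lambda_j C_{ij}.

% Stationarity of the Lagrangian (multiplier \mu for the constraint (4)):
\sum_{j=1}^{n} \lambda_j C_{ij} + \mu = C_{i0}, \quad i = 1, \dots, n,
\qquad \sum_{i=1}^{n} \lambda_i = 1.

% Substituting the first-order conditions back into the variance:
\mathrm{Var}[Z^*(x_0) - Z(x_0)]
  = \frac{1}{2} \Bigl( \sum_{i=1}^{n} \lambda_i C_{i0} + \mu \Bigr).
```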



Estimation Error. Once the weights are found, it is also possible to compute 
the (residual) variance of the estimation error at each point where an estimation 
is performed: 

1 " 

Var[Z*{xQ) - Z{xo)] = -'^XiCio ( 6 ) 

i=l 

This approach is also called objective analysis. 

When a great amount of data is available, kriging at each point cannot be
performed using all data, because it would lead to huge systems that cannot
be handled. Instead, it is necessary to choose a few data around the point where
the interpolation is to be done. Furthermore, these data have to be chosen to avoid singularity of
the system, which is usually done with the help of many geometric parameters
in kriging products. In any case, a system of linear equations must
be solved at each point of the final grid, which is computationally heavy.
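
The per-point cost can be made concrete with a small Python sketch that builds and solves one such system. Everything specific here is an assumption of the illustration: the linear model C(h) = h for the variance of the increment, the sign convention "sum over j of lambda_j C_ij plus mu equals C_i0", and the plain Gauss-Jordan solver. It is not the implementation used by the authors.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular kriging system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def krige(points, values, x0, variogram=lambda h: h):
    """Estimate the variable at x0 from the data (illustrative sketch).

    variogram(h) models the variance of an increment at distance h;
    the default linear model is an assumption of this sketch."""
    n = len(points)
    # Rows 0..n-1: sum_j lambda_j C_ij + mu = C_i0; last row: sum lambda = 1.
    a = [[variogram(math.dist(p, q)) for q in points] + [1.0] for p in points]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(p, x0)) for p in points] + [1.0]
    solution = solve_linear(a, b)
    weights = solution[:n]
    estimate = sum(w * z for w, z in zip(weights, values))
    return estimate, weights
```

One (n+1) by (n+1) system of this kind has to be solved for every grid point, which is the cost referred to above; the `ValueError` branch corresponds to the singular systems that careful data selection is meant to avoid.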

The main advantage of kriging is that it relies on a strong theoretical
background, which demonstrates that the interpolation is unbiased and optimal. The
main drawbacks are:

– The hypotheses made on the random variable are strong. It is possible to
relax them (allowing a deterministic drift on the data for example), but the
kriging system is then more complex. In any case, a model of the variance of the
increment of the variable must be computed, which can take very long when
a lot of data is available.

– It is difficult to ensure that a system built with some data, even carefully
chosen, will be regular. Therefore, especially with big data sets, it is possible
to have wrong estimates. Very wrong estimates can usually be detected,
because they simply are out of the range of the variable. However, there remains
the possibility of wrong estimates that are not far enough from the
expected value to be detected.

– To make a numerical model on a grid, it is necessary to interpolate, that is,
to solve the system of equations, at each point of the grid. This again might
take very long, depending on the desired resolution of the model.



3 Neural Interpolation 

Kohonen networks are artificial neural networks. Some work has already been
done on their use for adaptive meshing and is presented elsewhere [?]. It was
shown that a simple modification of the basic Kohonen self-organizing algorithm
(to constrain the peripheral neurons of the network to stay on the border of the
domain) makes it possible to produce valid meshes, with some advantages over classical
methods. The use of Kohonen networks for neural interpolation also relies on a
slight modification of the basic self-organizing algorithm.



3.1 Kohonen Self- Organizing Networks 

In their widely used form, Kohonen networks consist of a matrix of neurons, each
neuron being connected to its four nearest neighbors through fixed connexions
(this form is called a map). All neurons are also excited by the same input, a
vector of any dimension, through weighted connexions (Figure 1). The role of the
fixed connexions is to create a competition process between the neurons, so that
the one whose weights are the closest to the current input produces the highest
output. This competition is usually simulated by a simple computation of the
distances between the input and all the neurons, and selection of the neuron
with the smallest distance. This neuron is called the cluster of the network.




Fig. 1. Structure of a self-organizing map.

Kohonen has proposed a learning rule that modifies the weights of the network
so as to produce interesting representations of the input space. Indeed,
at the end of the learning process:

— each neuron is sensitive to a particular zone of the input space;

— neighboring neurons are sensitive to nearby zones;

— the distribution of the neurons tends to approximate the probability density
of the inputs presented during learning.

For these reasons, a self-organizing map is usually represented by
plotting the neurons, linked by their fixed connections, using the weights as
coordinates in the input space (see Fig. 2).

Learning Rule. Let:

— p be the dimension of the input space;

— $x(t) = (x_1(t), x_2(t), \ldots, x_p(t))$ be the input vector at time t;

— $w^k(t) = (w^k_1(t), w^k_2(t), \ldots, w^k_p(t))$ be the weight vector of
neuron k at time t.

At each time step t, the cluster c(t) of the network is searched:

$$c(t) = c(\{w^k(t)\}, x(t)) = k \ \text{such that}\ \|w^k(t) - x(t)\| \le \|w^i(t) - x(t)\| \quad \forall i \qquad (7)$$

where the norm is usually the Euclidean distance. The weights of the neurons are
then adapted with the following rule:

$$w^k(t+1) = w^k(t) - a(t)\, h(k, c(t))\, \bigl(w^k(t) - x(t)\bigr) \qquad (8)$$

where a(t) is a time-decreasing gain factor, and h(k, c(t)) a neighboring function.
This function depends on the topological distance, measured on the map, between
the neuron k and the cluster c(t). It takes a maximum value of 1 if the distance
is zero (neuron k is the cluster), and decreases when the distance increases. The
topological distance between two neurons is the distance between their row and
column indices in the map.

This algorithm is very robust, and numerous constraints can be applied to the
neurons without changing its properties. For example:

— Initializing the network as a regular grid over the whole space considerably
reduces the computation time.

— Constraining the peripheral neurons of the network to slide on the border
of the domain allows the production of naturally adapted meshes. The
right part of Fig. 2 gives an illustration of such a mesh, produced with
the same parameters as the left part, except for the constraint.



Fig. 2. Two maps organized using data chosen randomly on the black squares of a
chessboard. The only difference between the two is that the peripheral neurons
are constrained to slide on the border of the domain on the right.
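The learning rule of Eqs. 7 and 8 can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: the Gaussian form of the neighboring function, its radius, the 3 x 3 map size, the gain law and the uniform random inputs are all choices made here, not values taken from the text, and the names `find_cluster`, `neighborhood` and `som_step` are hypothetical.

```python
import math
import random


def find_cluster(weights, x):
    """Eq. 7: index of the neuron whose weight vector is closest to x."""
    return min(weights, key=lambda k: math.dist(weights[k], x))


def neighborhood(k, c, radius=1.0):
    """Assumed Gaussian h(k, c): equals 1 on the cluster, decreases with map distance."""
    d2 = (k[0] - c[0]) ** 2 + (k[1] - c[1]) ** 2
    return math.exp(-d2 / (2.0 * radius ** 2))


def som_step(weights, x, gain, radius=1.0):
    """Eq. 8: move every neuron toward x, scaled by gain * h(k, c)."""
    c = find_cluster(weights, x)
    for k, w in weights.items():
        g = gain * neighborhood(k, c, radius)
        weights[k] = [wi - g * (wi - xi) for wi, xi in zip(w, x)]
    return c


# A 3 x 3 map, keyed by (row, column), initialized as a regular grid
# on the unit square (the initialization recommended in the text).
weights = {(r, q): [r / 2.0, q / 2.0] for r in range(3) for q in range(3)}
rng = random.Random(0)
for t in range(1, 201):
    x = [rng.random(), rng.random()]
    som_step(weights, x, gain=1.0 / t ** 0.5)  # assumed gain law 1 / sqrt(t)
```

Since the gain and the neighboring function both stay in [0, 1], each update is a convex combination of the old weight and the input, so the neurons never leave the unit square in this example.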



3.2 Neural Interpolation Algorithm 

At each time step t, the input x(t) of the network is a three-dimensional vector,
the first two dimensions giving the location of the measure, and the third one
its value. We will denote this decomposition $x(t) = (x_{loc}(t), x_{val}(t))$
(loc = 1, 2 and val = 3). Each neuron k thus has a three-dimensional weight
vector $w^k(t)$. The only required modification of the basic self-organizing
algorithm is that the selection of the cluster c(t) be performed on the first
two dimensions only:

$$c(t) = c(\{w^k_{loc}(t)\}, x_{loc}(t)) = k \ \text{such that}\ \|w^k_{loc}(t) - x_{loc}(t)\| \le \|w^i_{loc}(t) - x_{loc}(t)\| \quad \forall i \qquad (9)$$

This means that the cluster is chosen in the geographical space, according to
the data location only, and is completely independent of the measured value.
The idea is to trust the locations rather than the measures, thus allowing very
different values measured at close points to be combined. Once the cluster is
found, the weight modification applies to all three weights of the neurons, as
presented in Eq. 8.
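A minimal sketch of one step of this modified algorithm follows. It assumes neurons stored as `[x1, x2, value]` lists and, for brevity, a unit neighborhood in which only the cluster is updated; the function names and the numerical values are illustrative only.

```python
import math


def neural_interpolation_step(weights, x, gain, h):
    """Cluster chosen on (x1, x2) only (Eq. 9); all three weights updated (Eq. 8)."""
    c = min(weights, key=lambda k: math.dist(weights[k][:2], x[:2]))
    for k, w in weights.items():
        g = gain * h(k, c)
        weights[k] = [wi - g * (wi - xi) for wi, xi in zip(w, x)]
    return c


def unit_neighborhood(k, c):
    """Assumed simplest neighboring function: only the cluster itself moves."""
    return 1.0 if k == c else 0.0


# Two neurons; the third weight is the interpolated value.
weights = {(0, 0): [0.0, 0.0, 5.0], (0, 1): [1.0, 0.0, 0.0]}
# The measure lies next to neuron (0, 1), but its value matches neuron (0, 0).
c = neural_interpolation_step(weights, [0.9, 0.0, 5.0], gain=0.5, h=unit_neighborhood)
```

In this example a selection on all three dimensions would have picked neuron (0, 0), whose value already equals the measure; selecting on the location only picks neuron (0, 1), whose value is then pulled from 0 toward 5.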

In this approach, the interpolation points cannot be chosen beforehand.
Instead, they are determined during the learning process, and correspond to the
final locations of the neurons. Therefore, at the end of the algorithm, we still
do not have the values on a regular grid, and a post-processing step is
required. The question is thus: what is the advantage of the final irregular
distribution of the neurons over the initial irregular distribution of the
data? If the number of neurons is lower than the number of measures, complexity
is reduced without loss of information. The set of neurons is the best
representation in space of the data set. If the associated values are optimal,
it is then possible to re-grid the values with a simple interpolation method,
with no loss of precision.

At any time step t, it can easily be shown that the weight vector of a neuron k
is a combination of the input vectors presented until then and of its initial
value:

$$w^k(t) = \sum_{i=1}^{t} a^k(i) \prod_{j=i+1}^{t} \bigl(1 - a^k(j)\bigr)\, x(i) + \prod_{i=1}^{t} \bigl(1 - a^k(i)\bigr)\, w^k(0) \qquad (10)$$

where $a^k(i) = a(i)\, h(k, c(i))$. The initial value $w^k(0)$ can always be set
to 0 without loss of generality. The weights of the combination do not depend on
the third dimension (value) of the input and neuron weight vectors, but rather
on the gain factor and on the locations of the input and neuron. Therefore, the
third weight of the neuron is at any time a true linear combination of the data
presented until then:

$$w^k_{val}(t) = \sum_{i=1}^{t} \lambda^k_i(t)\, x_{val}(i) \quad \text{with} \quad \lambda^k_i(t) = a^k(i) \prod_{j=i+1}^{t} \bigl(1 - a^k(j)\bigr) \qquad (11)$$
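The equivalence between the iterative update and the linear combination of Eq. 11 can be checked numerically. The sketch below assumes a zero initial weight and an arbitrary, made-up sequence of effective gains $a^k(i)$ and measured values; the helper names are hypothetical.

```python
import math


def iterate(gains, values, w0=0.0):
    """Apply the update w <- w - a * (w - x) once per presented value."""
    w = w0
    for a, x in zip(gains, values):
        w = w - a * (w - x)
    return w


def combination_weights(gains):
    """lambda_i = a(i) * prod_{j > i} (1 - a(j)), as in Eq. 11."""
    t = len(gains)
    return [
        gains[i] * math.prod(1.0 - gains[j] for j in range(i + 1, t))
        for i in range(t)
    ]


gains = [0.5, 0.4, 0.3, 0.2]       # assumed effective gains a^k(i) for one neuron
values = [10.0, 12.0, 11.0, 13.0]  # assumed measured values x_val(i)

lam = combination_weights(gains)
direct = iterate(gains, values)                      # iterative learning rule
combined = sum(l * x for l, x in zip(lam, values))   # closed form of Eq. 11
```

Both computations give the same third weight up to rounding, and the coefficients sum to one minus the product of the factors $(1 - a^k(i))$, which is the quantity studied in the unbiasedness argument.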

Is this linear combination optimal? We will show this in the same way as for
kriging. We first assume that the data are realizations of a random variable
Y. But this time, this variable is not stationary, and is the sum of an intrinsic
random variable (the Z variable used in kriging) and a deterministic linear
drift m:

$$Y(x) = Z(x) + m(x) \qquad (12)$$

We can first make the following analogies with kriging notations:

$$x_i = x_{loc}(i), \quad Y(x_i) = x_{val}(i), \quad x_0 = w^k_{loc}(t), \quad Y^*(x_0) = w^k_{val}(t), \quad \lambda_i = \lambda^k_i(t)$$

With these notations, the formula of Eq. 11 can be rewritten:

$$Y^*(x_0) = \sum_{i=1}^{t} \lambda_i\, Y(x_i) \qquad (13)$$

If t is sufficiently big, the second term of Eq. 10 (influence of the initial
weight vector of the neuron) can be neglected, therefore leading to:

$$x_0 = \sum_{i=1}^{t} \lambda_i\, x_i \qquad (14)$$



Unbiasedness. The expectation of the estimation error at point $x_0$ can be
written:

$$E[Y^*(x_0) - Y(x_0)] = E\Bigl[\sum_{i=1}^{t} \lambda_i Z(x_i) - Z(x_0)\Bigr] + \sum_{i=1}^{t} \lambda_i m(x_i) - m(x_0) \qquad (15)$$

This expression has a first, probabilistic term and a second, deterministic
term. Each of them must vanish to ensure unbiasedness. The second term is
naturally zero, because the drift is linear and the final location $x_0$ of the
neuron is a linear combination of the locations of the data presented (Eq. 14):

$$m(x_0) = m\Bigl(\sum_{i=1}^{t} \lambda_i x_i\Bigr) = \sum_{i=1}^{t} \lambda_i\, m(x_i) \qquad (16)$$

We can thus say that moving the neurons filters the drift.

For the first term to vanish, the same condition as for kriging is required:
the sum of the $\lambda_i$ must be 1. Let us denote this sum by $\Lambda_t$.
Expanding $\Lambda_t$ using Eq. 11 leads to:

$$\Lambda_t = 1 - \prod_{i=1}^{t} \bigl(1 - a^k(i)\bigr) \qquad (17)$$

$\Lambda_t$ tends to 1 when t increases iff the log of the product tends to
minus infinity. A first-order expansion gives:

$$\log \prod_{i=1}^{t} \bigl(1 - a^k(i)\bigr) = -\sum_{i=1}^{t} a^k(i) + \cdots \qquad (18)$$

If the gain factor a(t) follows a decreasing law of the type $1/t^{\beta}$ with
$0 < \beta < 1$, which is the usual convergence condition of Kohonen networks,
then the sum goes to infinity, and $\Lambda_t$ converges to 1.
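The behaviour of the sum of the coefficients in Eq. 17 can be illustrated numerically. The sketch below assumes, for simplicity, that the neighboring function takes a constant value h = 0.5 for the neuron considered (in practice it varies with the cluster at each step); the contrasting exponent 2, for which the sum of the gains converges, is added here only for comparison and is not discussed in the text.

```python
def weight_sum(t, beta, h=0.5):
    """Lambda_t = 1 - prod_{i=1..t} (1 - h / i**beta), i.e. Eq. 17 with a(i) = 1 / i**beta."""
    prod = 1.0
    for i in range(1, t + 1):
        prod *= 1.0 - h / i ** beta
    return 1.0 - prod


slow_decay = [weight_sum(t, beta=0.5) for t in (10, 100, 1000)]  # sum of gains diverges
fast_decay = [weight_sum(t, beta=2.0) for t in (10, 100, 1000)]  # sum of gains converges
```

With the exponent 0.5 the values increase toward 1, as required for unbiasedness; with the exponent 2 they level off well below 1, so the influence of the initial weight never disappears.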

Optimality. Showing optimality, that is, the minimization of the variance
of the estimation error, is much more difficult. Indeed, the neural interpolation
algorithm never uses an explicit knowledge of the variance of the increments of
the random variable. Therefore, an assumption is needed on how this variance
is taken into account in the algorithm. We suppose first that the variance only
depends on the distance between data points in the representation space of the
map instead of the input space. Furthermore, we suppose that the neighboring
function is a good representation of this variance:

$$C_{ij} = C_0 \bigl(1 - h(c(i), c(j))\bigr) \qquad (19)$$

where $C_0$ is a normalization factor. This assumption is intuitively true for
very big data sets. Indeed, in this case, measurements were made where
variations were expected rather than where the variable was known to be stable.
The resulting distribution thus reflects the variability of the measures.

The (deterministic) drift naturally disappears when writing the variance of the
estimation error. Optimality is therefore ensured under the same condition as
for kriging (first n lines of the kriging system):

$$\sum_{i=1}^{t} \lambda_i\, C_{ij} = C_{j0} \quad \forall j \qquad (20)$$

where $C_{j0} = C_0 (1 - h(c(j), k))$, k being the considered neuron. Under the
hypothesis of a unitary neighborhood, $\lambda_i$ is non-zero only when
c(i) = k, thus when $C_{ij} = C_{j0}$. The sum of the $\lambda_i$ being 1 as
shown above, this demonstrates optimality.

Estimation Error. If a model of variance of the increments is available, the
variance of the estimation error can be iteratively computed during the learning
process. An updating rule similar to the one of Eq. 8:

$$C^k(t+1) = C^k(t) - a(t)\, h(k, c(t))\, \bigl(C^k(t) - C_{t0}\bigr) \qquad (21)$$

would lead to the following result:

$$C^k(t) = \sum_{i=1}^{t} \lambda^k_i(t)\, C_{i0} \qquad (22)$$

which is simply twice the variance of the estimation error defined for kriging.
However, as no model of variance is required for neural interpolation, we
would prefer not to have to compute one at all. This model is an analytic
representation of the mean of the squared increments between data, as a function
of their distance. Neural interpolation being a stochastic process, we propose
to use at each time step the squared increment between the data presented and
each neuron, instead of a model of what this value should be. This leads to the
following new updating rule:

$$C^k(t+1) = C^k(t) - a(t)\, h(k, c(t))\, \Bigl(C^k(t) - \bigl(w^k_{val}(t) - x_{val}(t)\bigr)^2\Bigr) \qquad (23)$$

This rule allows a better understanding of the local variability of the measures.
Outliers can be more easily detected on the resulting map, because the local
error they produce is not smoothed, as would be the case with a general model
of variance. The error map reflects the variability of the data rather than
their density.

4 Comparison

4.1 Algorithmic Complexity

It is important to compare the algorithmic complexity of kriging and neural
interpolation, according to the numbers of data n, interpolation points m and
iterations t.

Concerning kriging, two steps are required: first the computation of the model
of variance, then the estimation itself. The first step needs a constant number
of operations a for every pair of data points. The second step consists in
solving a system of p linear equations with p unknowns, where p is the number of
data points considered. Remember that when the data set is big, not all data can
be considered to build the system of equations. We do not take into account
the time needed to cleverly choose these p points. Solving the system needs
$bp^3/10$ operations, where b is a constant depending on the chosen method, of
the same order as a. For m estimation points, kriging complexity is thus
$an^2 + bmp^3/10$.

Concerning neural interpolation, two steps are also required at each time
step. First, the distance between the considered data point and all neurons is
computed (we consider that we have m neurons). The cluster can be found in
constant time, under some simple hypotheses. Then, all neurons are updated
according to their distance to the cluster. These two steps require cmt
operations, where c is of the same order as a and b.

Neural interpolation and kriging have a comparable cost if the number of
iterations t is of the same order as the maximum of $n^2/m$ and $p^3/10$.
Clearly, when very few data points are available, kriging is faster, while when
a lot of data points are available neural interpolation is faster. Two numerical
examples are given in Table 1. The number p of points used to build the systems
of equations for kriging is 20. Although quite low when compared to the number
of data points available, it is often sufficient. The number t of iterations for
neural interpolation is chosen from our experience to allow convergence of the
algorithm. The last column gives (an order of) the number of operations required
for neural interpolation, while the one before gives this number for kriging.
Remember that the problem of choosing the right p points for kriging is not
taken into account when evaluating the number of operations.



Table 1. Numerical examples of the algorithmic complexity of each method.

n        m       p    t        # op. kriging   # op. neur. int.
100      1,000   20   1,000    810,000         1,000,000
10,000   1,000   20   10,000   100,800,000     10,000,000
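The operation counts of Table 1 can be reproduced from the two complexity formulas. The sketch below assumes the constants a, b and c are all equal to 1, which is the choice that matches the figures in the table; the function names are illustrative.

```python
def kriging_ops(n, m, p, a=1, b=1):
    """a*n^2 for the model of variance plus b*m*p^3/10 for the m linear systems."""
    return a * n ** 2 + b * m * p ** 3 // 10


def neural_ops(m, t, c=1):
    """c*m*t: every iteration computes distances to, and updates, all m neurons."""
    return c * m * t


rows = [
    # (n, m, p, t)
    (100, 1_000, 20, 1_000),
    (10_000, 1_000, 20, 10_000),
]
table = [(kriging_ops(n, m, p), neural_ops(m, t)) for n, m, p, t in rows]
```

The quadratic term in n dominates the kriging cost in the second row, which is where neural interpolation becomes the cheaper method.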



4.2 Practical Results 

The neural interpolation method has been used on synthetic and real data sets.
The results are available elsewhere. The synthetic data sets were intended to
check the optimality of the results, and were therefore not too big. Some
criteria were defined to assess the mean bias and optimality of the
interpolations for all the neurons of a network. It was shown that the bias is
nearly zero if there are enough modifications of the weights, that is, if the
number of iterations and the gain are sufficiently high. Optimality is ensured
with a very low relative error (less than 3%), and requires the same conditions
as the absence of bias. The residual relative error must be compared with the
error between the experimental variance and the model used in kriging, which is
generally about 8%.

Kriging has been used during the European project MEDATLAS to make an
atlas of the temperature climatology field over the Mediterranean Sea [2]. The
aim was to produce a series of maps of the temperature climatology state of the
sea, for each month and at some 29 standard depths (that is, 348 maps). The
number of data available was related to the depth considered, from about 260,000
at the surface to less than 1,000 at the bottom. The domain to model was the
whole Mediterranean basin on a grid with a step of about 25 km, which makes more
than 15,000 grid points. To achieve this goal in a reasonable computation time,
the kriging approach was based on an adapted meshing of the data set, which
made it possible to select the points where it was worth computing. For the
chosen points, a universal kriging method was used, based on a regional model of
variance. The results were then re-gridded on a regular grid using a
distance-weighted linear combination of the 4 closest estimated values. An
example of temperature map is given in Fig. 3.

A real data set was taken from the MEDATLAS project, with the aim of showing
the gain in computing time. The data set contained about 25,000 measures.
The interpolation was required on a grid of 204x72 points. The result was
checked using cross validation. With this aim, the regular estimation and error
grids were first reinterpolated to the data points using a classical bilinear
interpolation. Then, the error between the value used to build the map and the
value given by the map at each data point was computed. The mean error over all
data points was found to be -0.02, which is close enough to 0 to say that the
estimation is unbiased. Finally, the error at each point was compared with the
predicted error. The mean value was found to be 0.82, which is close enough to 1
to say that the error map is coherent with the estimation. However, there can be
cases where the estimation is very bad, although not biased, and coherent with
the error computed. Therefore, it was necessary to make a graphical comparison
between the maps. The map built by neural interpolation compared well with
the MEDATLAS one (Fig. 4). The neural interpolation method required less
than 1 minute of computing time, while the method used in MEDATLAS (which,
thanks to an adapted meshing, limited the number of points where to
interpolate) required more than four hours.

5 Conclusion 

An original method for fast interpolation of big data sets is presented in this
paper. The method relies on a basic modification of the standard Kohonen
algorithm. Its unbiasedness and optimality are demonstrated under hypotheses
similar to those used in kriging. The method has been applied to several
synthetic and actual data sets. In every case, the results compare very well
with those obtained by kriging. However, the method is much faster than kriging
when the data set is large, which is practically always the case for actual
problems. Future work will deal with taking into account an error on each data
point, which kriging can do. Other studies will deal with the use of the Kohonen
algorithm for data fusion and assimilation.

Fig. 3. MEDATLAS map of May at 10 m immersion (3-

Fig. 4. Map of the Med data set produced by neural interpolation.



Abstract. In this paper we present a novel approach to information
hiding. We investigate the possibility of embedding information by
exploiting the natural pseudorandomness of some classes of cover-documents.
In particular, we provide algorithms for embedding any binary
string in an image belonging to a particular class of images, the image
mosaics. The algorithms presented allow different levels of security for
the information hidden in the cover-document. We also show some
techniques to reduce the amount of information the users have to secretly
store.



1 Introduction 

One of the main differences between cryptography and steganography is that
each cryptographic algorithm maps a document, the plaintext, into a ciphertext
that must be pseudo-random. Indeed, it can be shown that if an encryption
scheme is deterministic then it cannot be considered secure (see [2] for a more
detailed description). Notice that we do not make any assumption about the
document that will undergo the encryption, in the sense that we can encrypt a
text as well as a binary file that already “seems” pseudo-random.

On the other hand, each steganographic scheme hides information, the
embedding-string, in a document, the cover-document, without significantly
altering the information contained in it. This means that if we hide information
in a non-random document, e.g. a text, the resulting document must be a
non-random document, while if we hide information in a pseudo-random document,
e.g. a compressed file, the result must be pseudo-random, e.g. a compressed
file. In particular, the output document of the steganographic scheme must
contain almost the same information as the input one.

This difference could be seen as making it impossible to use any
cryptographic primitive in steganographic schemes.

The basic observation we make in this work is that we can use a pseudo-random
document as input of an encryption scheme and as cover-document in a
steganographic scheme, and both schemes must return a pseudo-random document as
output. The major problem is that steganography also wants to hide from an
attacker the fact that information is hidden in a document. So, if the attacker
sees that Alice sends Bob a pseudo-random document, he will immediately
suspect that there is some hidden information in it. This could be avoided by
properly choosing the cover-document to use. Earlier work has shown how to
embed any file in a raid of pseudo-random cover-files, without significantly
altering their pseudo-randomness. On the other hand, we notice that its authors
“forced”, in a certain sense, the cover-documents to be pseudo-random. Indeed,
no disk drive stores information by randomizing it. This means that an
attacker can immediately detect that some information is hidden in the disks.

Thus, we are searching for a class of documents that are “naturally”
pseudo-random and such that their non-randomness is immediately recognizable or,
in any case, checkable. This means that if we use a compressed file as
cover-file, the output of the embedding function must be a compressed file too,
i.e. there must exist some (public) program that correctly uncompresses it.
Usually, when dealing with images, the “public program” is the human visual
system.

Many papers present results in steganography using images as cover-documents.
The schemes presented in these works embed information into images
by slightly modifying them in such a way that the human eye cannot see the
difference. This is done by modifying some information in the image
representation, or in its transformed representation.

Much work in image steganography has been done on digital watermarking.
The interest in these techniques is due to the growing need for copyright
protection on the internet. The goal of digital image watermarking is to embed
some private information into an image. This embedding procedure must satisfy
two main requirements. First, the embedding procedure should not significantly
alter the original image. Second, there must exist a function that checks (or
retrieves) that some information has been embedded in an image. Watermarking
techniques must also guarantee that it is impossible to change the information
embedded in an image without significantly altering the image itself. This
property actually stands for the impossibility of changing the “ownership”
information that the embedded information carries. This last property is
actually hard to assess, since images can undergo many manipulations, like
scaling, resizing, cropping, and so on.

One of the simplest methods to hide information in an image is to alter in some
way the least significant bits in a bit-plane representation of the image.
Several examples of this technique have been published. However, these LSB
methods are vulnerable in the sense that unauthorized parties could simply
recover the information hidden in an image.

Because of this problem, some authors developed techniques that
require the original cover-image in the phase that retrieves the information.

Another approach, described in other papers, selects some pixels of the image
using a pseudo-random generator and alters their luminance in some way.

In this paper we consider a particular class of pseudo-random images, the
image mosaics described in the next section. We shall show that it is possible
to use this class of images as cover-documents in a steganographic scheme.

The paper is organized as follows. In Section 2 we describe image databases
and photomosaics. In Section 3 we give the algorithms for embedding and
extracting information in an image mosaic. Finally, we present future work
to be done.

2 Preliminaries and Notations 

In this section we briefly review the terminology that we will use throughout
this paper.

2.1 Image Databases 

With the growth of the World Wide Web, people now have access to tens of
thousands of digital images, which can be collected in large databases. One of
the basic problems in this case is to retrieve particular images from these
databases.

According to the classic automatic information retrieval paradigm, a database
is organized into documents that can be retrieved via synthetic indices, in
turn organized in a data structure for rapid lookup and retrieval. The user
formulates his information retrieval problem as an expression in a query
language. The query is translated into the language of indices, the resulting
index is matched against those in the database, and the documents containing the
matched indices are retrieved.

One possible solution to the indexing problem would be to identify each
image with one or more keywords. However, such an approach should be avoided
for several reasons. First of all, the classification would have to be done
interactively, and this is, of course, not a good strategy for large databases.
This is not the only problem. As has been pointed out in the literature,
some pictorial properties of an image can be difficult to describe, while
other properties can be described in several different ways.

In recent years, techniques have been developed that allow the automatic
indexing of image databases using “keywords” related to the represented
images. Roughly, these techniques extract some information from the image and
use this information as “keywords” for image indexing and lookup. A simple
example of an automatically extracted keyword is the hash value of the image. Of
course, in these cases, the query that the user formulates is an image itself.
For a more detailed description of content-based image retrieval systems, see
the literature on the subject.

2.2 Mosaics of Images 

A classical mosaic is a set of small coloured stones arranged in such a way
that, seen from a distance, they compose a larger image. This artistic
expression is based on the property of the human visual system that “sees” the
colour of a region as the average colour of that region. Indeed, if the distance
between two points is below a certain threshold, the human eye sees only one
point whose colour is the average colour of the original ones.

The Photomosaics^1 created by R. Silvers are based on the same property.
Photomosaics are mosaics in which the coloured stones are replaced by small
photos. Silvers wrote a computer program that, starting from a database of small
photos, called tile images, catalogs these images according to some
characteristics, like colour, shape, contrast, and many others. To create the
photomosaic of a target picture, the software divides the target into
macro-pixels and then substitutes each of them with the image in the database
that best matches its characteristics. An example of a Photomosaic^2 is
presented in Figure 1.




Fig. 1. Photomosaic 



Other examples of image mosaics have been published, together with a process
to generate them automatically. The major difference between Silvers'
photomosaics and these image mosaics is the size of the tiles. An example of an
image mosaic is given in Figure 2. Notice that currently the generation of the
image mosaics is not completely automatic. This class of images is considered as
an artistic expression. Thus, the authors automatically generate a first “draft”
of the image mosaic and then modify it by hand in order to meet some personal
criteria.

^1 Photomosaic is a trademark of Runaway Technology.

^2 Image by courtesy of Runaway Technology.



Fig. 2. Image Mosaic

Another crucial point for the automatic generation of image mosaics is the
size of the tiles. Indeed, if the tiles become too large with respect to the
size of the image, the generated mosaic may not faithfully represent the target
image (e.g., the resolution of the target image will decrease).

3 The Schemes 

3.1 A General Scheme 

In the previous section we have described what a mosaic of photos is. Here we
will describe a scheme that makes it possible to embed any secret information in
such an image. For the sake of simplicity, we suppose that the mosaic
$\mathcal{M}$ is a black and white picture, i.e., each tile of the mosaic can be
seen as encoding a black pixel or a white one. We will later describe a
generalization of this technique to the case of coloured mosaics.

More formally, we can see a mosaic $\mathcal{M}$ as an array of n tile images
$M_1, \ldots, M_n$. Each picture $M_i$ can be classified as belonging to one of
the two classes of images $C_0$ or $C_1$, where $C_0$ (resp., $C_1$) contains
images that encode a white pixel (resp., a black pixel). We suppose that Alice
and Bob have agreed upon the two classes of tiles, i.e., they both hold these
two databases of images and they both hold a classification algorithm Class
that, on input $M_i$, outputs an index j such that $M_i \in C_j$. For the sake
of simplicity, suppose that these databases have the same size $s = 2^t$, for
some t (we will relax this assumption later). Moreover, Alice and Bob have a
standard ordering rule on each class of images, i.e., there exist four
functions, known to both players, $f_j : \{0, 1, \ldots, s-1\} \to C_j$, the
embedding functions, and $g_j : C_j \to \{0, 1, \ldots, s-1\}$, the extracting
functions, with j = 0, 1, such that for each picture P in $C_j$ we have
$f_j(i) = P \iff g_j(P) = i$.
Let X be the message Alice wants to send to Bob and let
$x_0, \ldots, x_{\ell-1}$ be its binary representation, where $\ell = vt$ for
some integer $v \le n$. In the following, Int is a function that takes as input
a binary string b and returns as output the integer whose binary representation
is b, and Bin is its inverse function. Alice will use the algorithm presented in
Figure 3.



Algorithm Embed($\mathcal{M}$, X)

1. Partition $x_0, \ldots, x_{\ell-1}$ into v blocks $B_0, \ldots, B_{v-1}$ of t bits each, i.e.,

   $B_i = x_{ti}, x_{ti+1}, \ldots, x_{ti+t-1}$

   Notice that each block is the binary representation of an integer in {0, 1, ..., s-1}.

2. for i = 0 to v-1

   2.1 Let j = Int($B_i$)

   2.2 Let c = Class($M_{i+1}$)

   2.3 Substitute $M_{i+1}$ with $f_c(j)$.

3. Output $\mathcal{M}$



Fig. 3. The Embedding Algorithm.



It can easily be seen that the information that the mosaic represents, i.e., the
image encoded in the mosaic, does not change, since the transformation Alice
applies to the mosaic simply substitutes a picture of the mosaic with another
picture of the same class. It can roughly be thought of as replacing a zero bit
in a binary string with another bit that is also zero.

The algorithm Bob has to use to “decode” the mosaic is the one presented in
Figure 4.

Algorithm Extract($\mathcal{M}$)

1. for i = 1 to n

   1.1 Let c = Class($M_i$)

   1.2 Let e = $g_c(M_i)$.

   1.3 Let $B_i$ = Bin(e)

2. Output X = $B_1, \ldots, B_n$.



Fig. 4. The Extracting Algorithm.



It is immediate to see that if v < n, then Bob will decode the message X with
some random elements R appended to it. To avoid this, Alice could append to
X some special termination symbol. Moreover, notice that the scheme presented
is deterministic. This means that, if $X = 0^{\ell}$, where $\ell = vt$, then
the scheme will substitute the first v tiles in the mosaic with the tiles
encoding $0^t$ in $C_0$ or $C_1$, depending on the class the substituted tile
belongs to. Of course, this situation should be avoided because, if v < n, the
resulting mosaic will immediately reveal to an attacker that something strange
is going on. In Sections 3.3 and 3.4 we will show how to solve this problem by
randomizing the embedding scheme.
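The two algorithms can be sketched together as a round trip. The sketch below is only an illustration: tiles are modelled as short strings, the two classes as ordered lists (so that $f_j(i)$ is simply the i-th element of class j), and the function names and the sample data are assumptions made here, not part of the scheme as published.

```python
def make_scheme(classes):
    """Build Class, f and g from two equally sized, ordered lists of tile identifiers."""
    position = {
        tile: (j, i) for j, tiles in enumerate(classes) for i, tile in enumerate(tiles)
    }

    def tile_class(tile):   # Class(M_i): which class the tile belongs to
        return position[tile][0]

    def f(j, i):            # f_j(i): the i-th tile of class j
        return classes[j][i]

    def g(j, tile):         # g_j(P): the index of tile P within class j
        return position[tile][1]

    return tile_class, f, g


def embed(mosaic, bits, classes):
    """Replace the first v tiles by same-class tiles whose index encodes a t-bit block."""
    tile_class, f, _ = make_scheme(classes)
    t = (len(classes[0]) - 1).bit_length()   # s = 2**t tiles per class, s >= 2
    assert len(bits) % t == 0 and len(bits) // t <= len(mosaic)
    out = list(mosaic)
    for i in range(len(bits) // t):
        block = bits[i * t:(i + 1) * t]
        out[i] = f(tile_class(mosaic[i]), int(block, 2))
    return out


def extract(mosaic, classes):
    """Read back t bits from every tile of the mosaic."""
    tile_class, _, g = make_scheme(classes)
    t = (len(classes[0]) - 1).bit_length()
    return "".join(format(g(tile_class(m), m), "0{}b".format(t)) for m in mosaic)


# Hypothetical databases: four "white" tiles and four "black" tiles (s = 4, t = 2).
classes = [["w0", "w1", "w2", "w3"], ["b0", "b1", "b2", "b3"]]
mosaic = ["w0", "b0", "b0", "w0"]
stego = embed(mosaic, "1101", classes)
decoded = extract(stego, classes)
```

Here the message occupies only the first two tiles, so the decoded string starts with the message and continues with bits read from the untouched tiles, which is the trailing material R mentioned above. Every tile keeps its class, so the black and white picture encoded by the mosaic is unchanged.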

We now evaluate the maximum length of X that it is possible to embed in
a mosaic. We suppose that the mosaic is composed of n tiles and that the sets
$C_0$ and $C_1$ each contain s images. Since each picture “encodes” log s bits
of X, we can embed n log s bits in a mosaic. We call this quantity the capacity
of the mosaic $\mathcal{M}$.
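As a worked example of the capacity formula, the sketch below computes n log s for a power-of-two class size. The figures used (1,000 tiles, 256 images per class) are hypothetical and chosen only to make the arithmetic concrete.

```python
def capacity_bits(n_tiles, class_size):
    """Capacity n * log2(s) of a mosaic, for a class size s that is a power of two."""
    t = class_size.bit_length() - 1      # log2(s) when s = 2**t
    assert class_size == 2 ** t, "class size is assumed to be a power of two"
    return n_tiles * t


bits = capacity_bits(1000, 256)   # hypothetical mosaic: 1,000 tiles, 256 images per class
payload_bytes = bits // 8
```

With these assumed figures each tile carries one byte, so the mosaic can hide 1,000 bytes; doubling the number of images per class adds only one bit per tile.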



3.2 Implementing the Functions f and g

It is not hard to see that one of the basic ingredient of the algorithms are the 
embedding and extracting functions / and g. Recall that these function are 
defined as follows: 



fj : {0, 1 , . . . , s — 1} — >■ Cj gj : Cj — >■ {0, 1 , . . . , s — 1} 

A simple way to implement these functions is to “embed” them in the image
database by adding to each image, in a preprocessing phase, a field containing a
unique number in $\{0, 1, \ldots, s-1\}$.

This simple process ensures that each image in a given class is associated with
a unique identifier that ranges from $0$ to $s-1$. Thus the implementation of $f$
is a query to $C$ with key equal to the input of $f$. The implementation of $g$
is a query to $C$ too; the difference between the two
requests is that, for the function $g$, the query key is an image.
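The database-lookup implementation of $f$ and $g$ can be sketched as follows. This is a minimal illustration, not the authors' implementation: images are stood in for by strings, the identifier of an image is assumed to be its position in its class list, and the names `build_tables`, `embed` and `extract` are hypothetical.

```python
def build_tables(classes):
    """Preprocessing: give each image in a class a unique id in {0, ..., s-1}.

    `classes` maps a class label to its list of images (strings here, as a
    stand-in for real pictures); the id is simply the list position.
    """
    f_table = {c: dict(enumerate(imgs)) for c, imgs in classes.items()}
    g_table = {c: {img: k for k, img in enumerate(imgs)} for c, imgs in classes.items()}
    class_of = {img: c for c, imgs in classes.items() for img in imgs}
    return f_table, g_table, class_of


def embed(mosaic, bits, f_table, class_of, t):
    """Replace the first tiles by same-class images whose ids spell out `bits`."""
    if len(bits) > t * len(mosaic):
        raise ValueError("message exceeds the capacity n * t of the mosaic")
    out = list(mosaic)
    for tile, start in enumerate(range(0, len(bits), t)):
        block = bits[start:start + t].ljust(t, "0")  # pad the last block with zeros
        out[tile] = f_table[class_of[mosaic[tile]]][int(block, 2)]
    return out


def extract(mosaic, g_table, class_of, t):
    """Read t bits from every tile by looking up its id inside its own class."""
    return "".join(format(g_table[class_of[m]][m], f"0{t}b") for m in mosaic)


# Two classes (dark/light) with s = 4 images each, so t = log2(4) = 2 bits per tile.
classes = {0: ["d0", "d1", "d2", "d3"], 1: ["l0", "l1", "l2", "l3"]}
f_table, g_table, class_of = build_tables(classes)
cover = ["d0", "l0", "d0", "l0"]
stego = embed(cover, "01111000", f_table, class_of, t=2)
recovered = extract(stego, g_table, class_of, t=2)
```

Every tile is replaced by an image of its own class, so the picture encoded by the mosaic is unchanged while the tile identifiers carry the message.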

3.3 Hiding Private Information 

In the previous sections we have shown that it is possible to embed any binary
string of length $\ell$ in a mosaic composed of $n$ pictures, provided that we hold at least two
classes of images with $s$ images each and that $\ell$ is upper bounded by $n \log s$. The
secrecy is guaranteed by the fact that an attacker is assumed not to know the database
or the embedding and extracting functions. We have also shown how to
implement these functions simply by “embedding” them in the database itself.

This technique forces the users to locally (and privately) maintain a copy of
the image databases. Notice that there could exist some embedding and extracting
functions that can be implemented without modifying the database
(the functions should extract information directly from the images). If
this is the case, the database could be public but, for the privacy of the scheme
presented in the previous section, the functions must still be private.

We put ourselves in a weaker condition in which the functions are imple- 
mented using some characteristic of the single image, i.e. do not depend on 
the database, and, at the same time, are public. To assure the privacy, we will 
actually embed in the mosaic an encrypted form of the message. 

More precisely, an encryption scheme is a pair $(E, D)$ of probabilistic, polynomial
time algorithms where $E$ is the encryption algorithm and $D$ is the decryption
algorithm. Recall that, for a probabilistic encryption scheme to be secure,
the ciphertexts obtained by different encryptions of the same plaintext must be
different. Moreover, the ciphertexts obtained from the encryption must be computationally
indistinguishable from a random string (see [1]). The encryption
(resp., decryption) algorithm takes as input an encryption (resp., decryption)
key $k$ and the plaintext (resp., the ciphertext) and outputs the ciphertext (resp.,
the plaintext).

Let $X = x_1, \ldots, x_\ell$ be the message Alice wants to send to Bob. The users
will exchange the message $X$ using the algorithms presented in Figure 5.

Algorithm Secure_Embed(M, X, k)

1. Y = E(k, X)

2. Output Embed(M, Y)

Algorithm Secure_Extract(M, k)

1. Y = Extract(M)

2. Output D(k, Y)



Fig. 5. Secure Embedding and Extracting Algorithms 



Since the database and the functions are public, an attacker can extract
$Y$ (a pseudo-random string) using the Extract algorithm. However, if the encryption
scheme is secure, then it is infeasible for the attacker to infer information about
the message $X$ we have embedded in the image mosaic. Moreover, since $Y$ looks
completely random, the attacker cannot tell whether some information has been hidden
in the mosaic. Using this approach, of course, the users must share a common
key $k$, and this is the only information that the users must keep secret.
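The encrypt-then-embed composition can be sketched as below. Everything here is an assumption made for illustration: the pair `encrypt`/`decrypt` is a toy probabilistic stream cipher built from SHA-256 and a random nonce, standing in for a properly vetted scheme $(E, D)$, and `embed_fn`/`extract_fn` are placeholders for whatever Embed and Extract routines are in use.

```python
import hashlib
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key || nonce || counter (illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Probabilistic: a fresh random nonce makes repeated encryptions differ."""
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    nonce, body = ciphertext[:16], ciphertext[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


def to_bits(data: bytes) -> str:
    return "".join(f"{byte:08b}" for byte in data)


def from_bits(bits: str) -> bytes:
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def secure_embed(embed_fn, mosaic, message: bytes, key: bytes):
    """Secure_Embed: encrypt first, then hand the ciphertext bits to Embed."""
    return embed_fn(mosaic, to_bits(encrypt(key, message)))


def secure_extract(extract_fn, mosaic, key: bytes) -> bytes:
    """Secure_Extract: run Extract, then decrypt the recovered bits."""
    return decrypt(key, from_bits(extract_fn(mosaic)))
```

In a real deployment the extracted bit string would also have to be trimmed to the ciphertext length (for instance with a length prefix), since Extract reads every tile of the mosaic.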

3.4 Enhancing the Security of the Embedding Scheme

The protocols presented in the previous sections embed information in an image
mosaic $M = M_1, M_2, \ldots, M_n$ by substituting the first $v$ tiles in the image, for
some $v \le n$. This means that, if an attacker believes that some information has
been embedded into a mosaic, he is sure that this information is hidden in
these tiles.

We have also shown how to embed private information by encrypting the message
before the modification of the image. The protocol presented, in this case,
does not modify the sequence of tiles changed by the embedding algorithm, even
though the amount of information that the users must keep secret dramatically decreases.

The security of the scheme presented in Section 3.3 is based on the security of
the underlying encryption scheme. Another simple way to improve the security of
the scheme is to allow the algorithm to select the tiles to change in
a pseudo-random way. As the receiver must be able to extract the information
embedded in the image mosaic, the users should share a second key $k_2$ and a
random seed $r$, which will be used in a pseudo-random number generator. This
will ensure the correctness of the decoding.



Algorithm Random_Embed(M, X, r, k_1, k_2)

1. Let Y = E(k_1, X).

2. Partition Y in v blocks, B_0, ..., B_{v-1}, of t bits each, padding zeros if necessary, i.e.,

B_i = y_{ti} y_{ti+1} ... y_{ti+t-1}

3. Let Used = ∅, i = 0.

4. Let Q = q_1, q_2, ..., q_m be a sequence of m bits generated by a pseudo-random
number generator on input key k_2 and random seed r.

5. while i < v

5.1 Let b be the first ⌈log n⌉ unused bits in Q.

5.2 Let j = Int(b)

5.3 If j ∉ Used

5.3.1 Substitute M_j with f_{Class(M_j)}(B_i).

5.3.2 Used = Used ∪ {j}.

5.3.3 i ← i + 1

6. Output M



Fig. 6. The Embedding Algorithm with Pseudo Random Selection. 



In this way, even if the attacker knows the database of the images, the em- 
bedding and the extracting function, he cannot extract any information from 
the image mosaic since he does not know which is the sequence of the elements 
he has to decode. 
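The pseudo-random tile selection can be illustrated with the sketch below. It is only a model of the idea: Python's `random.Random` is used as a stand-in generator and is not cryptographically secure, the way the key and seed are combined into one seed string is an assumption, and the names `tile_order`, `scatter` and `gather` are hypothetical.

```python
import random


def tile_order(n, v, key, seed):
    """Return v distinct tile indices in an order determined by (key, seed)."""
    if v > n:
        raise ValueError("cannot select more tiles than the mosaic contains")
    rng = random.Random(f"{key}:{seed}")  # stand-in PRNG, NOT cryptographically secure
    used, order = set(), []
    while len(order) < v:
        j = rng.randrange(n)
        if j not in used:  # skip indices that were already selected
            used.add(j)
            order.append(j)
    return order


def scatter(blocks, n, key, seed, filler="?"):
    """Sender side: place block i at the i-th pseudo-randomly selected tile."""
    tiles = [filler] * n
    for j, block in zip(tile_order(n, len(blocks), key, seed), blocks):
        tiles[j] = block
    return tiles


def gather(tiles, v, key, seed):
    """Receiver side: regenerate the same index sequence and read the blocks back."""
    return [tiles[j] for j in tile_order(len(tiles), v, key, seed)]
```

Because both parties derive the index sequence from the same shared key and seed, the receiver visits the tiles in exactly the order the sender used, while an observer without them does not know which tiles to decode.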

Notice that the algorithms presented in Figure 5 can be combined with the
one presented in this section to obtain a more “powerful” scheme, presented in
Figure 6, in which the user embeds the encrypted form of the message in the
image mosaic using randomly selected tiles of the original mosaic. The extracting
scheme can be easily obtained from the embedding one and is thus omitted.



3.5 A Note on Coloured Mosaics 

In the previous sections we have discussed the embedding of private information
in image mosaics and we have assumed that the original image is a black and 
white one. 

However, this requirement is not necessary at all. Indeed it is possible to 
extend the embedding schemes presented to work with coloured mosaics. 

The scheme will consider $d > 2$ classes of images $C_0, \ldots, C_{d-1}$ and will substitute
a tile in the mosaic with another one belonging to the same class, as
described in the previous sections. 



4 Future Work 

In the previous sections, we have analyzed some techniques allowing to embed 
information in image mosaics. We have based our approach on an existing class 
of images for which there already exists software for automatic generation. 

In this section we point out the possibility of using the same techniques presented
in Section 3 on a different class of cover-images. The scheme we present
here does not require any private database to be stored.

We now present the scheme for B/W images. Consider the cover-image as
an $n \times m$-bit matrix and suppose, for the sake of simplicity, that $n = wh$ and
$m = wv$, for a fixed $w$ and for some $v$ and $h$. It is thus possible to consider the
original cover-image as an $h \times v$ matrix of macro-pixels, each of which is a $w \times w$
matrix of pixels. A macro-pixel is said to encode black if more than $w^2/2$ pixels
are black, otherwise it is said to encode white.

We can construct a “database” of macro-pixels in the following way:

— A white macro-pixel is a $w \times w$ matrix of pixels in which all but one pixel
are white;

— A black macro-pixel is a $w \times w$ matrix of pixels in which all but one pixel
are black.

Each macro-pixel can be associated with an integer in the set $\{0, 1, \ldots, w^2 - 1\}$.

In Figure 7 we report a possible construction for the case of $w = 2$ with the
associated integer for each macro-pixel.

Notice that not all the values of $w$ are admissible. Indeed, as each macro-pixel
encodes a value in $\{0, 1, \ldots, w^2 - 1\}$, $t = \lceil \log w^2 \rceil$ bits are necessary to
represent this value. On the other hand, there exists no macro-pixel that encodes
a value in the set $\{w^2, \ldots, 2^t - 1\}$. This means that, if we use $t$ bits for
the embedding, there exist some binary strings that cannot be injected into the
mosaic. If we encode $t - 1$ bits in each macro-pixel, there will be some macro-pixels
that will never be used. This means that $w$ must be equal to some power of 2.
Indeed, in this case, $w^2 = 2^s$ and thus we can encode in each macro-pixel exactly
$s$ bits.

Fig. 7. Macro-Pixel Construction
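A macro-pixel of this kind can be modelled in a few lines. The sketch below is an illustration under stated assumptions: pixels are 0 (white) or 1 (black), and the integer associated with a macro-pixel is taken to be the row-major position of its single odd-coloured pixel, which need not be the numbering used in the original figure. The function names are hypothetical.

```python
def make_macro_pixel(colour, value, w):
    """Build a w x w macro-pixel of the given colour (0 = white, 1 = black).

    All pixels have `colour` except the one at row-major position `value`,
    which is flipped; that position carries the embedded integer.
    """
    if not 0 <= value < w * w:
        raise ValueError("value must lie in {0, ..., w^2 - 1}")
    grid = [[colour] * w for _ in range(w)]
    row, col = divmod(value, w)
    grid[row][col] = 1 - colour
    return grid


def read_macro_pixel(grid):
    """Recover (colour, value): majority colour plus position of the odd pixel."""
    w = len(grid)
    flat = [pixel for row in grid for pixel in row]
    colour = 1 if sum(flat) > w * w / 2 else 0  # black iff more than w^2/2 black pixels
    value = flat.index(1 - colour)
    return colour, value
```

With $w = 2$ each macro-pixel takes one of four values, that is, exactly two bits, and the majority rule still recovers the colour the macro-pixel stands for.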

It is important to remark that increasing $w$ degrades
the image. We have presented a macro-pixel construction in which all but one
pixel have the same colour. Of course, it is possible to think of different macro-pixel
constructions. For example, one could require that all but two
pixels have the same colour. This would, of course, raise the number of bits each
macro-pixel encodes, but the image quality could decrease.

5 Conclusion 

In this paper we presented a novel approach to data hiding in images. We have
identified a class of images that is particularly suited to this task. We have also
given some algorithms for information hiding that allow different levels of
security. The algorithms presented can be composed in order to meet the security
level required.

While the algorithms presented in Section 3 are secure and do not alter the
information the cover-image represents, the idea presented in Section 4 should
be verified experimentally in order to quantify the image degradation.
Abstract. Graphs are models of communication networks. This paper
applies combinatorial and symbolic-analytic techniques in order to characterize
the interplay between two parameters of a random graph: its
density (the number of edges in the graph) and its robustness to link
failures, where robustness here means multiple connectivity by short disjoint
paths. A triple $(G, s, t)$, where $G$ is a graph and $s, t$ are designated
vertices, is called $\ell$-robust if $s$ and $t$ are connected via at least two
edge-disjoint paths of length at most $\ell$. We determine here the expected
number of ways to get from $s$ to $t$ via two edge-disjoint paths in the
random graph model $G_{n,p}$. We then derive bounds on related threshold
probabilities $p_{n,\ell}$ as functions of $\ell$ and $n$.



1 Introduction 

In recent years the development and use of communication networks has increased
dramatically. In such networks, basic physical architecture combined with
traffic congestion or operating system decisions results in a certain, dynamically
changing geometry of the graph of interconnections. We adopt the random
graph model $G_{n,p}$ to capture link availability with probability $p$
(independently for each link) in a fully connected graph. Even in such a simple
network model, it is interesting to investigate the trade-off between its density 
(the number of edges) and its robustness to link failures. The existence of alter- 
native paths in such graphs models desired reliability and efficiency properties, 
such as the ability to use alternative routes to guide packet flow in ATM net- 
works or even improve the efficiency of searching robots on the World Wide Web, 
in the sense of an increased multiconnectivity of its hyperlink structure. 

Given a triple $(G, s, t)$, where $G$ is a $G_{n,p}$ random graph and $s, t$ are two of
its nodes, a natural notion of robustness is to require at least two edge-disjoint

* This work was partially supported by the EU Projects ESPRIT LTR ALCOM-IT,
IST-FET-OPEN ALCOM-FT, IMPROVING RTN ARACNE and the Greek GSRT
Project PENED-ALKAD.
paths of short length (say, exactly $\ell$ or at most $\ell$) between $s$ and $t$, so that
connectivity by short paths survives, even in the event of a link failure. We next
give the following formal definition of $\ell$-robustness.

Definition 1 ($\ell$-robustness). A random graph $G$ of the model $G_{n,p}$ is $\ell$-robust
for two vertices $s, t$ when there are two edge-disjoint paths of length at
most $\ell$ between $s, t$ in $G$.
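For small graphs the definition can be checked directly by exhaustive search. The sketch below is a brute-force illustration (exponential time, not a method taken from the paper); the graph is assumed to be given as a dictionary mapping each vertex to the set of its neighbours, and the function names are hypothetical.

```python
from itertools import combinations


def short_paths(adj, s, t, max_len):
    """All simple s-t paths with at most max_len edges, each as a frozenset of edges."""
    paths = []

    def dfs(node, visited, edges):
        if node == t:
            paths.append(frozenset(edges))
            return
        if len(edges) == max_len:
            return
        for nxt in adj[node]:
            if nxt not in visited:
                dfs(nxt, visited | {nxt}, edges + [frozenset((node, nxt))])

    dfs(s, {s}, [])
    return paths


def is_robust(adj, s, t, max_len):
    """True iff s and t are joined by two edge-disjoint paths of length <= max_len."""
    paths = short_paths(adj, s, t, max_len)
    return any(p.isdisjoint(q) for p, q in combinations(paths, 2))


# A 5-cycle: vertices 0 and 2 are joined by a path of length 2 and one of length 3.
pentagon = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
```

On the 5-cycle, the pair (0, 2) is 3-robust but not 2-robust, since the second route around the cycle needs three edges.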

In this work, we investigate the expected number of such paths between two
vertices of the random graph, as well as bounds for the threshold probability
$p_{n,\ell}$ (as functions of $\ell$ and $n$) for the existence of such paths in the $G_{n,p}$ random
graph $G$.

Although $G_{n,p}$ has been extensively studied, some questions
of existence of multiple paths, which are vertex or edge disjoint between specific
vertices, have not been investigated until recently. The theory of random graphs
began with the celebrated work of Erdős and Rényi in 1959 and by now
researchers know a lot about the probable structure of these objects (see, e.g., the
birth of the giant connected component). In this context we remark that
the question of existence of many vertex disjoint paths of small length has been
investigated by Nikoletseas et al.

With respect to the corresponding question of the existence of many edge 
disjoint paths, we refer to the fundamental work of Broder et al. In that
work, the authors show the existence of many edge disjoint paths in dense (with 
high probability connected) random graphs. However, the above work does not 
address the question of the existence of many edge disjoint paths of a certain 
length. Also, the estimation of the precise number of such paths (as a function 
of the density of the graph) still remains open. 

It turns out that even the enumeration of paths between the vertices $1$ and $n$
that avoid all edges of the graph $(1, 2, \ldots, n)$ but pass through its vertices is a
non-trivial task. In fact, such an enumeration is a special case of enumerating
permutations $(\sigma_1, \sigma_2, \ldots, \sigma_n)$ of $(1, 2, \ldots, n)$ where certain gaps $\sigma_{i+1} - \sigma_i$ are
forbidden. In our case, $\sigma_{i+1} - \sigma_i$ must not be in the set $\{-1, 1\}$.

In this work, we provide a precise estimate of the expected number of un- 
ordered pairs of paths in a random graph that connect a common source to a 
common destination, and have no edge in common, though they may share some 
nodes. Thus, for any given set of values of $n, p, \ell$ (where $\ell$ represents the length)
we estimate precisely the mean number of avoiding pairs in graphs of a given
size. This leads to a tight bound for the probability of non-existence of such
paths between any fixed pair of nodes and also to a bound for the probability of
the existence of many pairs of edge disjoint paths of length $\ell$.

In order to achieve this, we devise a finite state mechanism that describes 
classes of permutations with free places and exceptions. The finite-state descrip- 
tion allows for a direct construction of a multivariate generating function. The 
generating function is then subjected to an integral transform that implements 
an inclusion-exclusion argument and an explicit enumeration result; see Theorems
1 and 2. This enables us to quantify the trade-off between $\ell$-robustness
(as defined above) and the density of the graph (i.e. the number of its edges).
The originality of our approach consists in introducing in this range of problems 
methods of analytic combinatorics and recent research in automatic analysis (ba- 
sed on symbolic computation). For context, see 0) 0) 0- Additional threshold 
estimates regarding properties of multiple source-destination pairs are discussed 
in the last two sections of the paper. 

Summary of results: From earlier known results (i, m) and this paper, 
a picture of robustness under the $G_{n,p}$ model emerges. (As usual in random
graph theory, various regimes for $p = p(n)$ are considered.) Start with an initially
totally disconnected graph, corresponding to $p = 0$. As $p$ increases, the graph
becomes connected near the connectivity threshold $p_c(n) = (\log n)/n$. All the
following results hold for any integer $\ell$ with $6 < \ell < n$. Any fixed $s, t$ pair (or
equivalently a random $s, t$ pair, given the invariance properties of $G_{n,p}$) is “likely”
to be $\ell$-robust when $p$ crosses the value

Pn,e{n) = 2^ 

(see TheoremEland Equation 0). (Here “likely” means that the expected number 
of edge-disjoint pairs is at least 1 when $n \to \infty$). As long as $p < p_L(n, \ell)$, where

PL{n, 1) = ^log 

we know, w.h.p., the existence of s,t pairs that are not connected by short (of 
length at most £) paths; see TheoremEl (The function pL{n) is in fact a threshold 
for diameter). However, one only needs a tiny bit more edges, namely $p > p_U$:

Pu{n,£) = 2 (log (n^ log n) ) '^ 

to ensure that almost all $s,t$-pairs are $\ell$-robust; see Theorem 0



2 The Enumeration of “Avoiding” Configurations 

The problem at hand is that of estimating the expected number of “avoiding
pairs” of length $\ell$ between a random source and a random destination in a $G_{n,p}$
random graph $G$. (An avoiding pair of length $\ell$ means an unordered pair of paths,
each of length $\ell$, that connect a common source to a common destination, and
have no edge in common though they may share some nodes.) This problem first
necessitates the solution of enumeration problems that involve two major steps:

— Enumerate simple paths called “avoiding permutations” of length $\ell$ that can
be viewed as Hamiltonian paths on the set of nodes $[1, \ldots, \ell + 1]$, connecting
the source $1$ and the destination $\ell + 1$, and having no edge of type $[i, i + 1]$
or $[i, i - 1]$.

— Enumerate so-called “avoiding paths”, that are simple paths allowed to contain
outer nodes taken from outside the segment $[1, \ldots, \ell + 1]$. This situation
is closer to the random graph problem since it allows nodes taken from the
pool of vertices available in the graph $G \in G_{n,p}$.

The first problem is of independent combinatorial interest as it is equivalent 
to counting special permutations with restrictions on adjacent values. It also 
serves as a way to introduce the methods needed for the complete random graph 
problem. Both problems rely on the inclusion-exclusion principle that is fami- 
liar from combinatorial analysis and counting by generating functions (GF’s). 
Applications of these results to the Gn,p model are treated in the next section. 



2.1 Symbolic Enumeration Methods 

We use here a symbolic approach to combinatorial enumeration, according to 
which many general set-theoretic constructions have direct translations over ge- 
nerating functions. A specification language for elementary combinatorial objects 
is defined for this purpose. The problem of enumerating a class of combinato- 
rial structures then simply reduces to finding a proper specification, a sort of 
a formal grammar, for the class in terms of basic constructions. (This general 
method has been carefully explained in the fundamental work of F. Chyzak, Ph. 
Flajolet and B. Salvy ([5])). 

In this framework, classes of combinatorial structures are defined either ite- 
ratively or recursively in terms of simpler classes by means of a collection of 
elementary combinatorial constructions. The approach followed resembles the 
description of formal languages by means of context-free grammars, as well as 
the construction of structured data types in classical programming languages. 

The approach developed here is direct, more “symbolic”, as it relies on a 
specification language for combinatorial structures. It is based on so-called ad- 
missible constructions that have the important feature of admitting direct trans- 
lations into generating functions. We specifically examine constructions whose 
natural translation is in terms of ordinary generating functions. 

The ordinary generating function (OGF) of a sequence $\{A_n\}$ is, we recall,

$$A(z) = \sum_{n=0}^{\infty} A_n z^n.$$

Definition 2 (Admissible Constructions). Assume that $\Phi$ is a binary construction
that associates to two classes of combinatorial structures $B$ and $C$
a new class $A = \Phi(B, C)$ in a finite way (each $A_n$ depends on finitely many
of the $B_n$ and $C_n$). Then $\Phi$ is admissible iff the counting sequence $\{A_n\}$ of
$A$ is a function of the counting sequences $\{B_n\}$ and $\{C_n\}$ of $B$ and $C$ only:
$\{A_n\} = \Xi[\{B_n\}, \{C_n\}]$. In that case, there exists a well defined operator $\Psi$ relating
the corresponding ordinary generating functions: $A(z) = \Psi[B(z), C(z)]$.

In this work, we will basically use three important constructions: union, product 
and sequence, which we describe below. 
Definition 3 (Union Construction). The disjoint union $A$ of two classes
$B, C$ is the union (in the standard set-theoretic sense) of two disjoint copies,
$B^{\circ}$ and $C^{\circ}$, of $B$ and $C$. Formally, we introduce two distinct “markers” $\epsilon_1$ and
$\epsilon_2$, each of size zero, and define the (disjoint) union $A = B + C$ of $B, C$ by
$B + C = (\{\epsilon_1\} \times B) \cup (\{\epsilon_2\} \times C)$. The ordinary generating function is clearly
$A(z) = B(z) + C(z)$. We represent the disjoint union construction by Union.

Definition 4 (Product Construction). If the class $A$ is the cartesian
product of classes $B$ and $C$ ($A = B \times C$), then, considering all possibilities,
the counting sequences corresponding to $A, B, C$ are related by the convolution
relation $A_n = \sum_{k=0}^{n} B_k C_{n-k}$, and the ordinary generating function is clearly
$A(z) = B(z) \cdot C(z)$. We represent the product construction by Prod.

Definition 5 (Sequence Construction). If $C$ is a class of combinatorial
structures then the sequence class $\mathfrak{S}\{C\}$ is defined as the infinite sum $\mathfrak{S}\{C\} =
\{\epsilon\} + C + (C \times C) + \cdots$ with $\epsilon$ being a “null structure”, meaning a structure
of size 0. The null structure plays a role similar to that of the empty word
in formal language theory and the sequence construction is analogous to the
Kleene star operation ($C^*$). The ordinary generating function is clearly given by
$A(z) = 1 + C(z) + C^2(z) + \cdots = \frac{1}{1 - C(z)}$, where the geometric sum converges in
the sense of formal power series since $[z^0]C(z) = 0$. We represent the sequence
construction by Sequence.
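The three translations (sum, product, quasi-inverse) can be tried out on truncated power series. The following sketch represents an OGF by the list of its first coefficients; the helper names `union`, `prod` and `sequence` are chosen here to mirror the constructions and are not part of any library.

```python
def union(b, c):
    """Disjoint union: A(z) = B(z) + C(z)."""
    return [x + y for x, y in zip(b, c)]


def prod(b, c):
    """Cartesian product: A_n = sum_k B_k * C_{n-k} (truncated convolution)."""
    return [sum(b[k] * c[m - k] for k in range(m + 1)) for m in range(len(b))]


def sequence(c):
    """Sequence: A(z) = 1 / (1 - C(z)), valid when C has no object of size 0."""
    if c[0] != 0:
        raise ValueError("sequence construction requires [z^0] C(z) = 0")
    a = [0] * len(c)
    a[0] = 1
    for m in range(1, len(c)):
        # From A = 1 + C * A: a_m = sum_{k=1..m} c_k * a_{m-k}
        a[m] = sum(c[k] * a[m - k] for k in range(1, m + 1))
    return a


N = 8
z = [0, 1] + [0] * (N - 2)  # the atom Z: one object of size 1

# Words over {a, b} (both letters of size 1) that start and end with an a:
# Prod(Sequence(Prod(a, Sequence(b))), a)
words = prod(sequence(prod(z, sequence(z))), z)
```

Here `words[n]` counts the words of length `n` over a two-letter alphabet that begin and end with the letter a, namely 1 for `n = 1` and `2**(n-2)` for `n >= 2`.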



2.2 Avoiding Permutations 



An avoiding permutation of length $\ell$ is a sequence $\tau = [\tau_1, \tau_2, \ldots, \tau_\ell, \tau_{\ell+1}]$ that is
a permutation of $[1, \ldots, \ell + 1]$ and that satisfies the following conditions: $\tau_1 = 1$,
$\tau_{\ell+1} = \ell + 1$, and $\tau_{i+1} - \tau_i \neq \pm 1$ for all $i$ such that $1 \le i \le \ell$. Clearly, such a
permutation encodes a path from $1$ to $\ell + 1$ that has no edge in common with
the graph $(1, 2, \ldots, \ell + 1)$. The parameter $\ell + 1$ is referred to as the size. There
is no avoiding permutation for sizes 2, 3, 4, 5. Surprisingly, the first nontrivial
configurations occur at size 6, where the 2 possibilities are $[1, 4, 2, 5, 3, 6]$ and
$[1, 3, 5, 2, 4, 6]$, while for size 7, there are 10 possibilities:

[1,3,6,4,2,5,7], [1,3,5,2,6,4,7], [1,4,6,2,5,3,7], [1,4,2,6,3,5,7], [1,5,3,6,4,2,7],
[1,5,3,6,2,4,7], [1,5,2,4,6,3,7], [1,4,6,3,5,2,7], [1,6,3,5,2,4,7], [1,6,4,2,5,3,7].
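The small cases listed here are easy to reproduce by exhaustive search. The sketch below simply tries every arrangement of the intermediate values, so it is only practical for small sizes; the function name is hypothetical and size 1 is treated as the trivial one-node path.

```python
from itertools import permutations


def avoiding_permutations(n):
    """All permutations of 1..n that start with 1, end with n, and never
    place two consecutive integers next to each other."""
    if n == 1:
        return [(1,)]  # size 1: the trivial path consisting of a single node
    result = []
    for middle in permutations(range(2, n)):
        tau = (1, *middle, n)
        if all(abs(b - a) != 1 for a, b in zip(tau, tau[1:])):
            result.append(tau)
    return result


counts = [len(avoiding_permutations(n)) for n in range(1, 9)]
```

The resulting counts for sizes 1 to 8 are 1, 0, 0, 0, 0, 2, 10, 68, in agreement with the two configurations of size 6 and the ten of size 7.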



The goal in this subsection is to determine the number of avoiding per- 
mutations of size n (that is, of length n — 1). We prove: 

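Theorem 1. Avoiding permutations have ordinary generating function

$$Q(z) := \sum_{n} Q_n z^n = \frac{z\left(z - 1 + e^{\frac{1+z}{z(z-1)}}\,\mathrm{Ei}\!\left(1, \frac{1+z}{z(z-1)}\right)\right)}{(z-1)(1+z)} = \frac{z\left(z + z\,\mathrm{hypergeom}\!\left([1,1],[\,],-\frac{z(z-1)}{1+z}\right) + 1\right)}{(1+z)^2},$$

where Ei is the exponential integral and hypergeom represents the hypergeometric
series. Equivalently, $Q_n$ is expressible as a double binomial sum:

$$Q_{n+2} = (-1)^{n-1} + \sum_{k_2=0}^{n} \sum_{k_1=0}^{n-k_2} (-1)^{k_1+k_2}\,(n - k_1 - k_2)!\,\binom{n - k_1 - k_2}{k_1}\binom{n + 1 - k_1}{k_2}.$$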

Proof. By the inclusion-exclusion principle, we need to determine the number
$F_{n,j}$ of permutations $[\tau_1 = 1, \tau_2, \ldots, \tau_{n-1}, \tau_n = n]$ with at least $j$ “exceptions”,
among which $j$ are distinguished, that are successions of values of the form $\tau_i -
\tau_{i-1} = \pm 1$. The number of permutations with no exception is then:

$$Q_n = \sum_{j=0}^{n-1} (-1)^j F_{n,j}. \qquad (1)$$

A permutation with exceptions can be regarded as including a subcollection
of “exceptional” edges that belong to the graph with edges $(1, 2), (2, 3), \ldots, (n-1, n)$.
If we scan from left to right and group such exceptions by blocks, we get
a template; a template thus represents a possible pattern of exceptional edges.

A template can be defined directly as made of blocks that are either: (i)
isolated points; (ii) contiguous unit intervals oriented left to right (LR); (iii)
contiguous unit intervals oriented right to left (RL). There is the additional
constraint that the first and last blocks cannot be of type RL. For instance,
for $n = 13$, the template [[1,2,3], [4], [5,6], [7], [8], [11,10,9], [12,13]] will
correspond to any permutation that has successions of values (in the cycle traversal)
1,2; 2,3; 5,6; 11,10; 10,9; 12,13 as distinguished exceptions to the
basic constraint of avoiding permutations.

We next provide the combinatorial specification for avoiding permutations. 

Let {a, b} be a binary alphabet. We now describe the grammar of templates. 

The collection of strings beginning and ending with a letter a is described by 
the following rule: 



sp0 := S = Prod(Sequence(Prod(a, Sequence(b))), a)

(It suffices to decompose according to each occurrence of the letter a.) Now,
the three types of blocks in a template are described by the following rules:

sp1 := Prod(begin_blockP, Z, end_blockP)

sp2 := Prod(begin_blockLR, Z, Sequence(Prod(mu_length, Z), 1 <= card), end_blockLR)

sp3 := Prod(begin_blockRL, Sequence(Prod(mu_length, Z), 1 <= card), Z, end_blockRL)

Clearly, sp2 and sp3 are combinatorially isomorphic. For reasons related to
the application of the inclusion-exclusion argument, we keep track of the size
(number of nodes $\ell + 1$ of the basic interval graph, denoted by card) as well as
of the length of blocks and of their LR or RL character, denoted by mu_length.
Then, the grammar of templates is completed by substituting into sp0

a = Union(sp1, sp2) and b = sp3
Let $F_{n,k,j}$ be the number of templates with size $n$, $k$ blocks and $j$ exceptional
edges. Then, counting the number of ways of linking blocks together yields:

$$F_{n,j} = \sum_{k} F_{n,k,j}\,\phi(k), \qquad (2)$$

where $\phi(k)$ is the Gamma integral

$$\phi(k) = (k-2)! = \int_0^{\infty} e^{-u} u^{k-2}\,du$$

for $k \ge 2$, and $\phi(1) = 1$ (since any such linking is determined by an arbitrary
permutation of the $k - 2$ intermediate blocks). Observe that the extension of $\phi$
by linearity to an arbitrary series $h(u)$ in $u$ is given by

$$\phi(h(u)) = \int_0^{\infty} e^{-u}\left(h(u) - (u - u^2)\left.\frac{\partial}{\partial u}h(u)\right|_{u=0}\right)\frac{du}{u^2}.$$

That is to say, we just replace, in expansions, the linear term $u$ by $u^2$, divide by $u^2$, and apply the Euler integral

$$\int_0^{\infty} e^{-u} u^k\,du = k!.$$

Thus, with $F(z,u,v) = \sum F_{n,k,j}\,z^n u^k v^j$, the OGF $Q(z) = \sum Q_n z^n$ satisfies

$$Q(z) = \phi(F(z,u,-1)).$$

Thus, from (1) and (2) above, everything boils down to obtaining the $F_{n,k,j}$.



Template enumeration. The approach to determining the sequence $F_{n,k,j}$ consists
in introducing the trivariate GF, which immediately results from the above
combinatorial specification:

$$F(z,u,v) = \sum_{n,k,j} F_{n,k,j}\,z^n u^k v^j.$$

There, $z$ records size, $u$ records the total number of blocks (needed for subsequent
permutation enumerations since blocks should be chained to each other), and
$v$ records the total length of LR or RL blocks (the number of distinguished
exceptions needed for inclusion-exclusion).

We now carefully employ the generating functions for the union, product and 
sequence constructions in the grammar rules of the combinatorial specification 
for avoiding permutations defined above. 

The set of words made of a’s and b’s that start and end with an a is described
symbolically by

$$W = \left(1 - \frac{a}{1-b}\right)^{-1} a. \qquad (3)$$

This is because $(1 - f)^{-1} = 1 + f + f^2 + f^3 + \cdots$ generates symbolically all
sequences of objects of type $f$. Thus, $a/(1-b)$ represents an $a$ followed by a sequence of objects
of type $b$, and $W$ represents a sequence of such groups followed by a final $a$.
Take now the three types of blocks: isolated, LR,
and RL. The GF’s are, respectively, $z$, $LR(z) = z^2/(1-z)$, $RL(z) = z^2/(1-z)$.
This is because isolated points are always of size 1, while LR and RL objects
must be of size at least 2 (we have thus to multiply by $z^2$). Since the first and
the last blocks can only be isolated points or LR blocks, the univariate GF for
blocks is obtained by substituting $a$ by $z + LR$ (isolated point or LR block) and
$b$ by $RL$ in $W$. Marking each block with $u$ and each exceptional edge with $v$, we get the following trivariate GF:

$$F(z,u,v) = \frac{uz}{1 - vz}\left(1 - \frac{\dfrac{uz}{1 - vz}}{1 - \dfrac{uvz^2}{1 - vz}}\right)^{-1} = -\frac{uz\,(-1 + vz + uz^2 v)}{1 - 2vz - uz + v^2 z^2 + v^2 z^3 u}.$$



Path counting. For the inclusion-exclusion argument, it is easy to observe that
the desired sum $\sum F_{n,k,j}\,z^n u^k (-1)^j$ corresponds to the specialization
$F(z,u,-1)$. This yields:

$$F(z,u,-1) = -\frac{uz\,(-1 - z - uz^2)}{1 + 2z - uz + z^2 + z^3 u}. \qquad (4)$$

Application of the $\phi$-transformation (that counts the number of ways to
connect the blocks) requires the modified form, and so we get:

$$\frac{zu^2\,(uz^2 + 2z - uz + 1)}{(1+z)(uz^2 + z - uz + 1)}.$$

The corresponding ordinary generating function is

$$Q(z) = \int_0^{\infty} e^{-u}\,\frac{z\,(uz^2 + 2z - uz + 1)}{(1+z)(uz^2 + z - uz + 1)}\,du.$$

The quantity $Q(z)$ can be expressed in terms of the exponential integral
in the following closed form:

$$Q(z) := \frac{z\left(z - 1 + e^{\frac{1+z}{z(z-1)}}\,\mathrm{Ei}\!\left(1, \frac{1+z}{z(z-1)}\right)\right)}{(z-1)(1+z)}.$$



Since one deals with ordinary generating functions, this is to be taken as a formal
(asymptotic) series. Note also that the exponential integral (Ei) involves the
divergent series of factorials $\sum_{n \ge 0} n!\,z^n$, which is also a hypergeometric
series. This gives rise to a general conversion procedure from exponential
integrals to hypergeometric forms. Hence, another closed form for the OGF of
the $Q_n$ is

$$Q(z) = \frac{z\left(z + z\,\mathrm{hypergeom}\!\left([1,1],[\,],-\frac{z(z-1)}{1+z}\right) + 1\right)}{(1+z)^2}.$$

Thus, with $F(z,u,v) = \sum F_{n,k,j}\,z^n u^k v^j$ and the OGF $Q(z) = \sum Q_n z^n$, by
recalling that $Q(z) = \phi(F(z,u,-1))$ we get the expression for $Q(z)$ as stated in
the Theorem. The expression can then be expanded using the binomial theorem,
and double combinatorial sums result for coefficients. □



Though they have no direct bearing on the graph problem at hand, we men- 
tion two interesting consequences of this theorem. 

Corollary 1. The quantities $Q_n$ satisfy the recurrence

$$(n+1)Q_n + Q_{n+1} - 2nQ_{n+2} + 4Q_{n+3} + (n+3)Q_{n+4} - Q_{n+5} = 0,$$

where $Q_0 = 0$, $Q_1 = 1$, $Q_2 = Q_3 = Q_4 = 0$, and the asymptotic
estimate

$$Q_{n+2} \sim n!\,e^{-2}.$$



Proof. To get the recurrence relation, we use the following holonomic descriptions
(introduced by Zeilberger), that is, descriptions of sequences that satisfy linear recurrences
with polynomial coefficients:



{z^ + z^ + 4z^ - 1- z + 4z^) Y( 0 ) + (- 20 ^‘ + 0 ^ 




-0^ - 2z^_Co - 2z^ - 04_Co + -Coz^ -z^ + z-z^ = 0 
where $Y(z) = Q(z)$. By putting $\_C_0 = 1$, we get:

$$(z^5 + z^4 + 4z^3 + 4z^2 - z - 1)\,Y(z) + (z^6 - 2z^4 + z^2)\,\frac{d}{dz}Y(z) - z^5 - 2z^4 - 4z^3 + z = 0.$$



We can now get (by elementary properties of the z-transform) the following
transformation to a linear recurrence:

$$u(0) = 0,\; u(1) = 1,\; u(2) = 0,\; u(3) = 0,\; u(4) = 0,\; u(5) = 0,$$

$$(n+1)u(n) + u(n+1) - 2n\,u(n+2) + 4u(n+3) + (n+3)u(n+4) - u(n+5) = 0.$$

We note that this provides an algorithm that uses a linear number of arithmetic
operations to determine the quantities $Q_n$. By using the following principle based
on the generating function method:

$$[z^n]\,\mathrm{hypergeom}\!\left([1,1],[\,], z + dz^2 + O(z^3)\right) = n!\,e^{d}\,(1 + o(1)),$$

provided that the argument of the hypergeometric is a function that is analytic
at the origin, and since

$$Q(z) = \frac{z\left(z + z\,\mathrm{hypergeom}\!\left([1,1],[\,],-\frac{z(z-1)}{1+z}\right) + 1\right)}{(1+z)^2}$$

with

$$-\frac{z(z-1)}{1+z} = z - 2z^2 + 2z^3 - 2z^4 + 2z^5 + O(z^6),$$

we have proved that the asymptotic proportion of legal permutations is exactly
equal to $e^{-2}$. □



The recurrence above implies the non-obvious fact that the numbers Q_n of
avoiding permutations are computable in linear time. The asymptotic estimate
extends properties known for permutations with excluded patterns (e.g.,
derangements have asymptotic density $e^{-1}$). Consequently, a nonzero proportion
(about 13.53%) of all permutations that start with 1 and end with n are
avoiding.
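The linear-time claim can be illustrated by running the recurrence of Corollary 1 directly. The sketch below is a hedged example: it takes the six initial values from the text and applies the recurrence from n = 1 onward (the n = 0 instance would force Q(5) = 1, which disagrees with the listed initial values). The brute-force cross-check assumes that an avoiding permutation of size n is a permutation of 1..n that starts with 1, ends with n, and has no two neighbouring entries that differ by 1. That definition is not restated in this section, so it is an assumption, and both function names are hypothetical.

```python
from itertools import permutations


def q_by_recurrence(n_max):
    """Q_0..Q_n_max from the recurrence, one step per new term."""
    q = [0, 1, 0, 0, 0, 0]                     # initial values quoted in the text
    while len(q) <= n_max:
        n = len(q) - 5                         # the new term is Q_{n+5}, with n >= 1
        q.append((n + 1) * q[n] + q[n + 1] - 2 * n * q[n + 2]
                 + 4 * q[n + 3] + (n + 3) * q[n + 4])
    return q[: n_max + 1]


def count_avoiding_permutations(n):
    """Brute force under the assumed definition (small n only)."""
    if n == 0:
        return 0
    if n == 1:
        return 1
    count = 0
    for middle in permutations(range(2, n)):
        seq = (1, *middle, n)
        if all(abs(a - b) != 1 for a, b in zip(seq, seq[1:])):
            count += 1
    return count


values = q_by_recurrence(9)
print(values)
print([count_avoiding_permutations(n) for n in range(9)])
```

Each new term costs a constant number of arithmetic operations, which is the sense in which the sequence is computable in linear time.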



2.3 Avoiding Paths 

We consider now the problem of counting the number Q_{n,j} of avoiding paths of
type (n,j), where n is the size (the number of nodes) and j is the number of
"outer nodes". Such avoiding paths are defined by the fact that they satisfy the
basic constraints of avoiding permutations regarding the base line (1,2,...,n),
but contain in addition j outer nodes taken to be indistinguishable (unlabelled)
and conventionally represented by the symbol *. For instance, for types (n,j) =
(3,1), (4,1), (4,2), the listings are respectively

{[1,*,3]} {[1,3,*,4],[1,*,2,4]} {[1,*,*,4]} 
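The three listings can be reproduced by exhaustive enumeration. The sketch below rests on one reading of the definition that is consistent with those listings, and that reading is an assumption: a path of type (n,j) is a sequence of length n that starts with 1 and ends with n, whose n−2 middle entries consist of j symbols * and n−2−j distinct labels from {2,...,n−1}, such that no two neighbouring labels differ by 1. The function name is hypothetical.

```python
from itertools import combinations, permutations

STAR = "*"


def avoiding_paths(n, j):
    """All avoiding paths of type (n, j) under the assumed reading."""
    inner = n - 2 - j                          # number of labelled middle entries
    if inner < 0:
        return []
    found = []
    for labels in combinations(range(2, n), inner):
        for positions in combinations(range(n - 2), inner):
            for order in permutations(labels):
                middle = [STAR] * (n - 2)
                for pos, lab in zip(positions, order):
                    middle[pos] = lab
                path = (1, *middle, n)
                # a '*' never conflicts; two neighbouring labels must not differ by 1
                if all(a == STAR or b == STAR or abs(a - b) != 1
                       for a, b in zip(path, path[1:])):
                    found.append(path)
    return found


for n, j in [(3, 1), (4, 1), (4, 2)]:
    print((n, j), avoiding_paths(n, j))
```

With j = 0 the same routine enumerates avoiding permutations under the assumed definition.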



Theorem 2. The number of avoiding paths is expressible as 

n—j n—j — k 2 

Qn+2.j=Y^ - kl - k2)\ 

k2=0 ki=0 

fn — j + 1 — ki\ f n — ki — k 2 \ ^ 

A k2 ; V i ) 

( where j > 0) 

Proof. We first define templates on which an inclusion-exclusion argument is 
applied. The specifications are a simple modification of the templates associated 
to avoiding permutations. 

Let {a, b, x} be a ternary alphabet. We now define the grammar of templates.



n — j — k\ — k2 
ki 
The collection of strings beginning with a and containing only one x that
occurs at the end is described by the rule:

sp0 := S = Prod(Sequence(Prod(a, Sequence(b))), x)

(It suffices to decompose according to each occurrence of the letter a). We first 
need so-called “outer points” that are taken from outer space: 

Outerpoints := Sequence(Prod(Z, mu_outerpoint))

We also need "inner points":

Innerpoints := Sequence(Prod(Z, mu_innerpoint))

Size is defined as the cumulative number of points in the pair of paths that 
underlies an avoiding path in the sense above: it is thus equal to the length 
of the avoiding path plus the number of * symbols corresponding to the outer 
nodes. We thus introduce a special notation for nodes of the integer line that 
are shared by the two paths: 



Z2 := Prod(Z, Z)

Now, the three types of blocks are described by the following rules:

sp1 := Prod(mu_block, Z2, Outerpoints, Innerpoints)

sp2 := Prod(mu_block, Z2, Sequence(Prod(mu_length, Z2), card >= 1),
            Outerpoints, Innerpoints)

sp3 := Prod(Sequence(Prod(mu_length, Z2), card >= 1), Z2, Outerpoints,
            Innerpoints, mu_block)

(Clearly, sp2 and sp3 are combinatorially isomorphic.) The blocks that can
occur at the end are of type x and can only be of type sp1 or sp2, but with
neither outer points nor inner points.

sp1x := Prod(mu_block, Z2)

sp2x := Prod(mu_block, Z2, Sequence(Prod(mu_length, Z2), 1 <= card))

The above grammar is completed (to give S) by substituting into sp0

a = Union(sp1, sp2)
b = sp3 and
x = Union(sp1x, sp2x)

The 5-variate GF immediately results from the above specification: 



F(z, u, v, w_1, w_2) :=



— — 1 + 21D2 z in wi 1V2 V — V z W2 — V z"^ ini + v z^ in^ W2 u z'^ v) 

1 — z(w2 -h + z^ (wi W2 — u) -t- V z^ ( — 2 -t~ 2 z w I — 2 z^ wi W2 -h (1 — z W2 — z + z^ ini W2 -h z^ u)) 
where u, v, w_1, w_2 represent the blocks, the length, the outer nodes and the inner
nodes, respectively.

For inclusion-exclusion, we set v = −1. Application of the φ-transformation
(that counts the number of ways to connect the blocks) requires the modified
form



F{z,u,-1,wi,W2) := 

14 ^ 2^(1 + 22^ — 2^n;j^ -\- u w-^ W2 — w 2 — 142 ^ — z W2 — 2it)j^ + 2^u;j^u;2) 

(1 + 2^)(2^ lUJ^ W2 + u z^ — Z^ W2 — z^ Wl + 2^ VJ1 W2 + z^ — 14 2^ — 2 W2 — 2 Ut 1 + 1) 

The ordinary generating function is here 



«(*) >= 



2^(1 + 22^ — 2^ tui + 14 2^ + z'^ W1 W2 ~ 2^ 1D2 — 14 2^ — 2 W2 ~ 2 ILI + 2^ lUJ^ ‘^2^ ® ^ 
(1 + 2^)(2^ W1 W2 + U Z^ — 2^ 1412 — 2^ xu-^ + z'^ 1412 +2^ — 14 2 ^ — 2 1L>2 — 2 14’! + 1) 



And this can be expressed in terms of the exponential integral as follows: 



(1 + 2 -=)( h ; 2 + 14 ; i ) 
Q(2) := 2^ - e 2(2-l)(l + 2) 



Ei ^1 



ij^+2'^ 1412+2'^— 2 ii;2— 2 ii;j^ + 

22(22_i) 



0 



(l + 2^)(2^ Wl 1U2 + 1) 
, 22(1-22) 



_ (1 + 2 -^)( i 4 ’ 2 +'»; i ) 
z(z-l)(l + z) 



(22 _ 1)(1 + 22 ) 

(l + 2^)(i»2+t4;i) (l + 2^)(it»2 + i»i) 

; 2 ( 1 - 22 ) _ ^ 2 ( 1 - z 2 ) 



(22 - 1)(1 + 22) 

Again, there is an “explicit form” of the OGF of the problem 



Q(z) := 2 



^(2i4ij^ it;2 — 1^2 — ^l) + 22 hypergeom j [1 ,!],[],— 



22 (^-1)(1 + 



(l + 22)(2 W2 - 



(1 + 2 ) \ 

1)(2 ii;i -1) J 



(1 + z-^)‘^(z W2 - 1)(2 lui - 1) 

, 2^(1 + Wl W2) — z(w2 + + 1 

(1 + 22)2(2 ^2 — l)(2u;j^ — 1) 



and also 



Q{z) := 



z* hypergeom ([1, !],[],- 

(1 + Z'^)'^{ZW2 — 1){ZWI — 1) 



1 + + 



The coefficient c(n,j,k) of z^n w_1^j w_2^k is obtained by straight expansion and
avoiding paths are then enumerated by C(n,j) = c(n,j,j). The corresponding
formulae of the Theorem statement are obtained directly by symbolic expansions.

The computations are rather intensive and, for instance, the 4-variable GF 
that “lifts” F{z,u, —1) is found to be 

14 2^ I 1 — 2 11)2 ~ 2 : 14>1 + 2^ Wl 11)2 + 2^ — 2^ 11)2 ~ 2^ Wl + 2^ Wl 14)2 + U Z^ J 

r (5) 

(1 + 22) ^^4 Wl W2 + u - z^ W2 - z^ Wl + Wl W2 + z'^ - u z'^ - z W2 - z Wl + Ij 

It is to be noted that computations have been performed with the help of the 
computer algebra packages Combstruct and Gfun that are dedicated to auto- 
mating computations in combinatorial analysis and have been developed in the 
Maple system for symbolic computation. □ 
3 Average-Case Analysis for the Random Graph Model

We show now how to estimate the robustness to link failures in a random graph
that obeys the G_{n,p} model. An avoiding pair of length ℓ in a graph is an unordered
pair of paths, each of length ℓ, with a common source and a common destination,
that may share some nodes, but are edge disjoint. We have:

Theorem 3. The mean number of avoiding pairs of length ℓ between a random
source and a random destination in a random graph obeying the G_{n,p} model is

$N_\ell(n,p) = \dfrac{1}{2}\,\dfrac{p^{2\ell}}{n(n-1)} \sum_{j} \dfrac{n!}{(n-\ell-1-j)!}\; Q_{\ell+1,j},$

where the coefficients Q_{n,j} are given by Theorem 2.

Proof. The coefficient 1/2 corresponds to the fact that one takes unordered
pairs of paths; the coefficient 1/(n(n − 1)) averages over all possible sources and
destinations; the factor p^{2ℓ} provides the edge weighting corresponding to G_{n,p};
the arrangement numbers account for the number of ways to embed an avoiding
path into a graph by choosing certain nodes and assigning them in some order to
an avoiding path; the coefficients Q_{ℓ+1,j} provide the basic counting of avoiding
paths that build up avoiding pairs. □

Note 1. Since the model implies isotropy, the quantity N_ℓ(n,p) is also the
mean number of avoiding pairs between any fixed source and destination s, t.

Robustness. A short table of initial values of N_ℓ(n,p) follows:

$N_2 = \tfrac12 (n-2)(n-3)\,p^{4}$

$N_3 = \tfrac12 (n-2)(n-3)^2(n-4)\,p^{6}$

$N_4 = \tfrac12 (n-1)(n-2)(n-3)(n-4)(n-5)^2\,p^{8}$

$N_5 = \tfrac12 (n-2)(n-3)(n-4)(n-5)^2\,(n^3 - 11n^2 + 25n + 32)\,p^{10}$
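The first rows of the table can be recomputed from the structure described in the proof of Theorem 3. The sketch below is a hedged illustration: it assumes that the arrangement number for an avoiding path with ℓ+1 base nodes and j outer nodes is the falling factorial n(n−1)···(n−ℓ−j), and it counts avoiding paths by brute force under the reading of Section 2.3 in which neighbouring labels must not differ by 1. All function names are hypothetical.

```python
from fractions import Fraction


def count_avoiding_paths(size, outer):
    """Number of avoiding paths of type (size, outer) under the assumed reading."""
    if size - 2 - outer < 0:
        return 0

    def extend(prev, used, stars_left, slots_left):
        if slots_left == 0:                    # close the path with the label `size`
            return 1 if prev is None or abs(prev - size) != 1 else 0
        total = 0
        if stars_left:                         # put an outer node '*' in this slot
            total += extend(None, used, stars_left - 1, slots_left - 1)
        if slots_left > stars_left:            # put a fresh label in this slot
            for lab in range(2, size):
                if lab not in used and (prev is None or abs(prev - lab) != 1):
                    total += extend(lab, used | {lab}, stars_left, slots_left - 1)
        return total

    return extend(1, frozenset(), outer, size - 2)


def falling(n, k):
    """n (n-1) ... (n-k+1)."""
    out = 1
    for i in range(k):
        out *= n - i
    return out


def mean_avoiding_pairs(ell, n, p):
    """Exact N_ell(n, p) under the assumptions stated above."""
    total = sum(falling(n, ell + 1 + j) * count_avoiding_paths(ell + 1, j)
                for j in range(ell))
    return Fraction(1, 2) * Fraction(p) ** (2 * ell) * total / (n * (n - 1))


n, p = 40, Fraction(1, 3)
print(mean_avoiding_pairs(2, n, p) == Fraction(1, 2) * (n - 2) * (n - 3) * p ** 4)
print(mean_avoiding_pairs(3, n, p) == Fraction(1, 2) * (n - 2) * (n - 3) ** 2 * (n - 4) * p ** 6)
```

Exact rational arithmetic is used so that the comparison with the closed forms is an identity check rather than a floating-point approximation.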

From developments in the previous section, the formulae are computable in low
polynomial time (as a function of ℓ). They make it possible to estimate the mean
number of avoiding pairs in graphs of a given size for all reasonable values of
n, p, ℓ. Take for instance a graph with n = 10^5 nodes and an edge probability
p = 5 · 10^{-5}. This corresponds to a mean node degree that is extremely close to
5, so that, on average, each node has 5 neighbors. Then the mean values are

N_2 = 3.1 · 10^{-8}, N_3 = 7.8 · 10^{-7}, N_4 = 1.9 · 10^{-5}, N_5 = 4.8 · 10^{-4}, N_6 = 1.2 · 10^{-2},
N_7 = 0.30, N_8 = 7.6, N_9 = 190, N_10 = 4763, N_11 = 119062, N_12 = 2.9 · 10^6.



Thus, in this example, one expects to have short and multiple connections
between source and destination provided paths of length 8 are allowed. This
numerical example also shows that there are rather sharp transitions. The formula
of Theorem 3, which entails the rough approximation

$N_\ell(n,p) \approx \tfrac12\, n^{2\ell-2}\, p^{2\ell}, \qquad (6)$

precisely accounts for such a sharpness phenomenon.

In the introduction, we have defined ℓ-robustness as multiple connectivity by
edge-disjoint paths of length at most ℓ. In fact, Equation (5) leads to explicit
expressions for generalized avoiding pairs of type (ℓ_1, ℓ_2) that are made of two
paths, of lengths ℓ_1, ℓ_2. It can then be seen that the bottleneck for existence of
pairs (ℓ_1, ℓ_2) with ℓ_1, ℓ_2 ≤ ℓ is in fact the case (ℓ, ℓ). Thus, since N_ℓ(n,p) → 0
when p n^{1−1/ℓ} → 0, the function

$p_r(n,\ell) = n^{-1+1/\ell}$

is a cut-off point for ℓ-robustness and an (≤ ℓ, ≤ ℓ)-avoiding pair is expected or
not depending on whether p/p_r tends to 0 or to ∞.

Corollary 2. Any fixed pair in a G_{n,p} graph is almost certainly not ℓ-robust
if p/p_r(n,ℓ) → 0.

Proof. When p/p_r(n,ℓ) → 0, then the expected number N_ℓ(n,p) of the desired
pairs of paths tends to 0 and so does the probability of existence of at least
one such pair of paths (since this probability, by Markov's inequality, is bounded
from above by the expectation). Thus, with probability tending to 1, there is no
pair of edge disjoint paths between the two vertices and these two vertices are,
almost certainly, not ℓ-robust. □



4 Thresholds in the Random Graph Model 

In this section we provide bounds for the probability (and thus the threshold, if
it exists) of the existence, between any fixed pair of vertices, of two edge-disjoint
paths of length at most ℓ, by proving the following:

- We give an estimation of the value p_L = p_L(n,ℓ) such that, with probability
tending to 1 as n goes to infinity, G_{n,p} graphs with p < p_L contain some
pair of vertices that is not connected by two edge-disjoint paths of length
at most ℓ.

- We present a value p_U = p_U(n,ℓ) such that almost every G_{n,p} graph with
p > p_U has almost all its source-destination pairs of vertices connected by
at least two edge-disjoint paths of length at most ℓ.



Theorem 4. Define 

Then, for p < p_L(n,ℓ), almost surely, there exists a pair of vertices in the G_{n,p}
graph that does not have the ℓ-robustness property.



Proof. By using the threshold function for diameter. 



□ 






P. Flajolet et al. 



Theorem 5. Define

$p_U(n,\ell) = 2\,\bigl(\log(n^2 \log n)\bigr)^{1/\ell}\, n^{-1+1/\ell}.$

Then, for p > p_U(n,ℓ), almost surely, almost all pairs of vertices of a G_{n,p} graph
have the ℓ-robustness property.

Proof. Consider two independent distributions G_{n,p_1} and G_{n,p_2} on the same set
of vertices. Let E_i (i = 1, 2) be the events "G_{n,p_i} has diameter ℓ".

Consider the graph G obtained when we superimpose an instance G' ∈ G_{n,p_1}
and an instance G'' ∈ G_{n,p_2} and OR them (i.e., G has an edge joining u,v iff at
least one of G', G'' has). Clearly G ∈ G_{n,p} with

p = p_1(1 − p_2) + p_2(1 − p_1) + p_1 p_2 = p_1 + p_2 − p_1 p_2.

In fact, if u, v are joined in G' by a path P_1 and in G'' by a path P_2, then these
two paths both exist in G. For p around the threshold for diameter ℓ of G_{n,p} and
ℓ = o(n), the number of pairs u,v of G for which the paths of G', G'' overlap is
o(n^2); thus the vast majority of pairs of vertices (n^2 − o(n^2) of them) in G are
connected via two edge disjoint paths of length ≤ ℓ.

This gives approximately p_U ≈ p_1 + p_2 − p_1 p_2, and if p_1 = p_2 = p_0 (p_0 a
threshold for diameter ℓ or ℓ+1) then

$p_U \le 2p_0 \le 2\,(2\log n - \log c)^{1/\ell}\, n^{1/\ell - 1}$

from P], where c can be adjusted so that the diameter is almost surely ℓ (see
P], Corollary 12, p. 237). □
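The edge probability of the superimposed graph used in the proof can be checked by listing the four joint outcomes for a single potential edge. This is a minimal sketch with exact rationals; the function name is hypothetical.

```python
from fractions import Fraction


def union_edge_probability(p1, p2):
    """Probability that an edge is present in G' OR G'' for independent G', G''."""
    total = Fraction(0)
    for in_first in (True, False):
        for in_second in (True, False):
            weight = (p1 if in_first else 1 - p1) * (p2 if in_second else 1 - p2)
            if in_first or in_second:
                total += weight
    return total


p1, p2 = Fraction(1, 7), Fraction(2, 9)
print(union_edge_probability(p1, p2), p1 + p2 - p1 * p2)
```

Since p_1 + p_2 − p_1 p_2 ≤ p_1 + p_2, taking p_1 = p_2 = p_0 gives an edge probability of at most 2p_0 for the superimposed graph.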

5 A Discussion on the Extension to All Pairs 

In this section we show how our results can be used to provide a tighter bound
for p_U.

Lemma 1. For every graph G(V,E), if vertices u,v are each connected to a
specific vertex x ∈ V via two edge disjoint paths each of length ℓ, then u,v are
connected in G via two edge disjoint paths, each of length at most 3ℓ.

Proof. For simplicity, let the two (edge disjoint) paths from u to x be colored
blue and the two (edge disjoint) paths from v to x be colored red. Take one of the
two red paths and mark the first red-blue intersection vertex x_1 of it (such a
vertex always exists since in the worst case x_1 = x). Now take the other red
path and mark the first red-blue intersection vertex x_2 (again this vertex can be
x). There are two cases:

Case 1: Vertices x_1, x_2 are on different blue paths. Then the Lemma is easily
proved by simply following the two different blue parts and then continuing
with the two different red ones. Note that the two blue parts are edge disjoint,
the two red continuations are also edge disjoint and there is no red-blue edge.
Case 2: Both x_1, x_2 are on the same blue path. Let x_1 be the one closest to u on
this blue path. Take the first u-v path to be from u (on this blue path) to x_1 and
then from x_1 to v (by the same red path which defined x_1), and let the second u-v
path be composed of the other red path from v to x_2, then the blue part from
x_2 to x and then the unused other blue path returning to u. Again, there is
obviously no edge intersection.

With respect to length, the worst case is clearly Case 2, where the second
constructed path has pieces from three of the four initial paths, leading to length
at most 3ℓ. □

Lemma 1 yields the following corollary:

Corollary 3. For every graph G(V,E), if there exists a vertex x ∈ V such that
for all vertices u,v ∈ V (u,v ≠ x) each of u,v connects to x via two edge disjoint
paths of length at most ℓ, then the diameter of G is at most 3ℓ and each pair u,v ∈ V
is connected via two edge disjoint paths of length at most 3ℓ.



Theorem 6. Given G_{n,p}, if p(n,ℓ) is such that the probability that two specific
nodes of G are connected via two edge disjoint paths of length at most ℓ is at
least 1 − θ (where θ = o(1/n)), then all pairs of nodes u,v of G are each connected
via two edge disjoint paths of length at most 3ℓ with probability at least 1 − nθ.

Proof. By applying to the probability of existence of paths between all pairs of
vertices the fact that the probability of a union of events (existence of paths
between two specific nodes) is bounded from above by the sum of the
probabilities. □

Theorem 6 can provide an upper bound for the all pairs problem, by using
an upper bound p_U such that for p > p_U, for every instance of G_{n,p}, any fixed
(or random) pair has the ℓ-robustness property with probability tending to
1 as n tends to infinity. The derivation of such a bound could be approached
by the computation of the second moment of the ℓ-robustness distribution, a
computation that seems to be of major technical difficulty, and will be further
examined in the future.



6 Conclusions and Further Research 

We estimated here, tightly and also asymptotically, the mean number of ways
to get at least two edge disjoint paths between any two specific nodes of G_{n,p}
graphs. We pose as an open problem the calculation of the second moment
for strengthening the threshold and also the extension of the problem to the
existence of k edge disjoint paths.
Abstract. The unweighted k-edge-connectivity augmentation problem
(kECA for short) is defined by "Given a σ-edge-connected graph G =
(V,E), find an edge set E' of minimum cardinality such that G' = (V, E ∪
E') is (σ+δ)-edge-connected and σ+δ = k", where E' is called a solution
to the problem. Let kECA(S,SA) denote kECA such that both G and
G' are simple.

The subject of the present paper is (σ+1)ECA(S,SA) (or kECA(S,SA)
with k = σ+1). Let M be any maximum matching of a certain graph
R(G) whose vertex set V_R consists of vertices representing all leaves of
G. From M we obtain an edge set E'_0, with |E'_0| = |M|, such that each
edge connects vertices in distinct leaves of G. Let L_1 be the set of leaves
to be created by adding E'_0 to G, and K_1 the set of remaining leaves of
G.

The main result is to propose two O(σ²|V| log(|V|/σ) + |E| + |V_R|³) time
algorithms for finding the following solutions: (1) an optimum solution if
G has at least 2σ+6 leaves or if |L_1| ≤ |K_1| and G has less than 2σ+6
leaves; (2) a 3/2-approximate solution if |L_1| > |K_1| and G has less than
2σ+6 leaves.



1 Introduction 

The unweighted k-edge-connectivity augmentation problem (kECA for short) is
described as follows: "Given a σ-edge-connected graph G = (V,E), find an edge
set E' of minimum cardinality such that G' = (V, E ∪ E') is (σ+δ)-edge-
connected and σ+δ = k." We often denote G' as G + E', and E' is called
a solution to the problem. Let kECA(*,**) denote kECA with the following
restrictions (i) and (ii) on G and E', respectively: (i) * is set to S if G is required
to be simple, and * is left as it is to mean that G may be a multiple graph; (ii) **
is set to MA if creation of new multiple edges in constructing G' is allowed,
and is set to SA otherwise. In kECA(*,SA), if G is simple then so is G', or if
G has multiple edges then any multiple edge of G' exists in G. As for kECA,
J. van Leeuwen et al. (Eds.): IFIP TCS2000, LNCS 1872, pp. 169-^^^ 2000. 
kECA(*,MA) has mainly been discussed so far. See |d 11171811211 dl2(121l22l2dj for
the results. It is natural for us to assume that |V| ≥ σ+2 in (σ+1)ECA(S,SA):
in (σ+1)ECA(*,SA), we may have |V| ≤ σ+1.

As related results, kECA(S,SA) for G having no edges was first discussed in
0, where a problem more general than kECA(S,SA) is considered. An
O(|V| + |E|) algorithm for 2ECA(S,SA) can be obtained by slightly modifying the
one given in |2| for 2ECA(*,MA). As for 3ECA(*,SA), proposed an O(|V| +
|E|) algorithm for 3ECA(*,MA), and showed that if |V| ≥ 4 then this algorithm
finds an optimum solution to 3ECA(*,SA). Concerning (σ+1)ECA(S,SA) with
|V| ≥ σ+2 for σ ∈ {3,4}, proposed an O(|V| log |V| + |E|) algorithm.
Other related results have been reported in PITH] . T. Jordan showed in m
that kECA(S,SA) is NP-hard in general, and [2j proposed an 0(|E|^) algorithm
for kECA(S,SA) for any fixed k.

The subject of the present paper is (σ+1)ECA(S,SA), that is, kECA(S,SA)
with k = σ+1. Let M be any maximum matching of the leaf-graph R(G) whose
vertex set V_R consists of vertices representing all leaves of G. (The definition of
R(G) is going to be given later.) From M we obtain a certain edge set E'_0, with
|E'_0| = |M|, such that each edge connects vertices in distinct leaves of G. Let L_1
be the set of leaves to be created by adding E'_0 to G, and K_1 the set of remaining
leaves of G.

The main result of the paper is to propose two O(σ²|V| log(|V|/σ) + |E| +
|V_R|³) time algorithms for finding the following solutions for (σ+1)ECA(S,SA):

(1) an optimum solution if G has at least 2σ+6 leaves or if |L_1| ≤ |K_1| and G
has less than 2σ+6 leaves;

(2) a 3/2-approximate solution if |L_1| > |K_1| and G has less than 2σ+6 leaves.

A central concept in solving kECA is a t-edge-connected component of G: a
maximal set of vertices such that G has at least t edge-disjoint paths between
any pair of vertices in the set |22| . A t-edge-connected component whose degree
(the number of edges connecting vertices in the set to those outside of it) is equal
to the edge-connectivity of G is called a leaf. (σ+1)ECA(S,SA) can
be solved in almost the same way as general kECA(*,MA); the only difference is that
the augmenting step has to choose a pair of leaves, each containing a vertex, such
that the two chosen vertices are not adjacent in G. (Such a pair of leaves is called a
nonadjacent pair.) This requires addition of some other characteristics or processes in finding
solutions by means of structural graphs: a structural graph is introduced in nn,
and is used as a useful tool that reduces time complexity in finding a solution
to kECA(*,MA) in

This paper adopts the operation called edge-interchange in finding a
solution; it was introduced in |2fll21| in order to reduce time complexity of
Pg. A set of two nonadjacent pairs of leaves is called a D-combination if they
are disjoint. The augmenting step in solving (σ+1)ECA(S,SA) repeats both
choosing a nonadjacent pair of leaves and enlarging a (σ+1)-edge-connected
component by means of edge-interchange (or an analogous operation). Hence



The (σ+1)-Edge-Connectivity Augmentation Problem






obtaining an optimum solution requires finding a maximum set of nonadjacent
pairs of leaves such that any two members in the set form a D-combination and,
therefore, this is reduced to finding a maximum matching of the leaf-graph R(G)
of G. The point of (σ+1)ECA(S,SA) is that a solution E' is closely related to
a maximum matching M of R(G).

The paper is organized as follows. Basic definitions and several basic
results on σ-edge-connected components and leaf-graphs are given in Section 2.
In Section 3 results on maximum matchings of leaf-graphs are briefly
mentioned. The edge-interchange operation is explained in Section 4. Section 5 discusses
(σ+1)ECA(S,SA) when G has less than 2σ+6 leaves, and Section 6 considers
(σ+1)ECA(S,SA) when G has at least 2σ+6 leaves.

All proofs are omitted because of space limitation.

2 Preliminaries 

2.1 Basic Definitions 

Technical terms not specified here can be identified in nnnni. An undirected
graph G = (V(G), E(G)) consists of a finite and nonempty set of vertices V(G)
and a finite set of undirected edges E(G), where V(G) and E(G) are often
denoted as V and E, respectively. An edge e incident upon two vertices u, v in
G is denoted by e = (u,v) unless any confusion arises. We denote V(e) = {u,v},
or generally V(K) = {u, v ∈ V | (u,v) ∈ K} for a subset K ⊆ E. For disjoint
sets X, X' ⊂ V, we denote (X,X';G) = {(u,v) ∈ E | u ∈ X and v ∈ X'},
where it is often written as (X,X') if G is clear from the context. We denote
d_G(X) = |(X, V − X; G)|. This is called the degree of X (in G). We set d_G(S) = 0
if S = ∅. If X = {v} then d_G({v}) is denoted simply as d_G(v) and is the total
number of edges (v,v'), v' ≠ v, incident upon v. We often denote d_G(S) as d(S)
if G is clear from the context. A path between vertices u and v is often called a
(u,v)-path and denoted by P_G(u,v), and is often written as P(u,v) if G is clear
from the context. For two vertices u, v of G, let λ(u,v;G), or simply λ(u,v),
denote the maximum number of pairwise edge-disjoint paths between u and v.

For a set X ⊆ V, let G[X] denote the subgraph having X as its vertex set
and {(u,v) ∈ E | u, v ∈ X} as its edge set. G[X] is called the subgraph of G
induced by X (or the induced subgraph of G by X). Deletion of X ⊂ V from G
is to construct G[V − X], which is often denoted as G − X. If X = {v} then we
often denote G − v for simplicity. Deletion of Q ⊆ E from G defines a spanning
subgraph of G, denoted by G − Q, having E − Q as its edge set. If Q = {e} then
we denote G − e. For a set E' of edges such that E' ∩ E = ∅, let G + E' denote
the graph (V, E ∪ E'). If E' = {e} then we denote G + e.

Let K ⊆ E be any minimal set such that G − K has more components than
G. K is called a separator of G, or in particular an (X,Y)-separator if any vertex
of X and any one of Y are disconnected in G − K. If X = {u} or Y = {v} then it
is denoted as a (u,Y)-separator or an (X,v)-separator, respectively. A minimum
(X,Y)-separator K of G is an (X,Y)-separator of minimum cardinality. Such
K is often called an (X,Y)-cut or a |K|-cut. It is known that a (u,v)-cut K
has |K| = λ(u,v;G). A minimum separator of G is a separator of minimum
cardinality among all separators of G, and |K| is called the edge-connectivity
(denoted by λ(G)) of G; particularly we call such K ⊆ E a minimum cut (of G).
G is said to be k-edge-connected if λ(G) ≥ k. A k-edge-connected component
(k-component, for short) of G is a subset S ⊆ V satisfying the following (a)
and (b): (a) λ(u,v;G) ≥ k for any pair u,v ∈ S; (b) S is a maximal set that
satisfies (a). Let Γ_G(k) denote the set of all k-components of G. In a graph G
with λ(G) = σ, a (σ+1)-component S with d_G(S) = σ is called a leaf (σ+1)-
component of G (or a leaf of G, for short). It is known that λ(G) ≥ k if and only
if V is a k-component. Note that distinct k-components are disjoint sets. Each
1-component is often called a component.
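The definitions of λ(u,v), k-components and leaves can be tried out on a small graph. The sketch below is an illustration under stated assumptions, not the algorithm of the paper: it computes λ(u,v) by a plain unit-capacity augmenting-path maximum flow, groups vertices by the relation λ(u,v) ≥ k (an equivalence relation on vertices), and reports the (σ+1)-components of degree σ as leaves. The example graph (two triangles joined by one edge) and all function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def local_connectivity(adj, s, t):
    """lambda(s, t): maximum number of edge-disjoint s-t paths (unit-capacity flow)."""
    cap = {(u, v): 1 for u in adj for v in adj[u]}
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:       # breadth-first search in the residual graph
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        v = t
        while parent[v] is not None:           # push one unit along the path found
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1


def k_components(adj, k):
    """Classes of the relation lambda(u, v) >= k."""
    comps = []
    for v in sorted(adj):
        for comp in comps:
            if local_connectivity(adj, v, next(iter(comp))) >= k:
                comp.add(v)
                break
        else:
            comps.append({v})
    return comps


def degree(adj, S):
    """d_G(S): number of edges leaving the vertex set S."""
    return sum(1 for u in S for w in adj[u] if w not in S)


def leaves(adj):
    """Edge-connectivity sigma and the (sigma+1)-components of degree sigma."""
    sigma = min(local_connectivity(adj, u, v) for u, v in combinations(sorted(adj), 2))
    return sigma, [S for S in k_components(adj, sigma + 1) if degree(adj, S) == sigma]


edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = {v: set() for v in range(6)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)
print(leaves(adj))
```

For this graph the edge-connectivity is 1, the 2-components are the two triangles, and both are leaves because exactly one edge leaves each of them.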

Note that we assume that |V| ≥ σ+2 in (σ+1)ECA(S,SA), the subject of
the paper.

A cactus is an undirected connected graph in which any pair of cycles share at
most one vertex. A structural graph F(G) of G with λ(G) = σ is a representation
of all minimum cuts of G and is introduced in EH-. We use the term "nodes of
F(G)" to distinguish them from vertices of G. F(G) is an edge-weighted cactus
of O(|V|) nodes and edges such that each tree edge (an edge which is a bridge
in F(G)) has weight λ(G) and each cycle edge (an edge included in any cycle)
has weight λ(G)/2. Let F(G) be a structural graph of G. Particularly if σ is odd
then F(G) is a weighted tree. (Examples of G and F(G) will be given in Figs. 1
and 2.) Each vertex in G maps to exactly one node in F(G), and F(G) may
have some other nodes, called empty nodes, to which no vertices of G are mapped.
Let ε(G) ⊆ V(F(G)) denote the set of all empty nodes of F(G). Note that any
minimum cut of G is represented as either a tree edge or a pair of two cycle
edges in the same cycle of F(G), and vice versa. Let ρ: V → V(F(G)) − ε(G)
denote this mapping. We use the following notations: ρ(X) = {ρ(v) | v ∈ X} for
X ⊆ V, and ρ^{-1}(Y) = {v ∈ V | ρ(v) ∈ Y} for Y ⊆ V(F(G)). ρ({v}) or ρ^{-1}({v})
is written as ρ(v) or ρ^{-1}(v), respectively, for notational simplicity. For any cut
(X, V(F(G)) − X; F(G)), if the summation of weights of all edges contained in the
cut is equal to σ then (ρ^{-1}(X), V − ρ^{-1}(X); G) is a σ-cut of G. Note that the cut
of F(G) consists of either one tree edge or a pair of two cycle edges in the same
cycle of F(G). Conversely, for any σ-cut (X, V − X; G), F(G) has at least one cut
(Y, V(F(G)) − Y; F(G)) in which the summation of weights of all edges contained in the
cut is equal to σ, where Y is a node set of F(G) such that ρ(X) = Y − ε(G). Each
(σ+1)-component S of G is represented as a node ρ(S) ∈ V(F(G)) − ε(G) in
F(G), and, for any node v ∈ V(F(G)) − ε(G), ρ^{-1}(v) is a (σ+1)-component of
G. For v ∈ V(F(G)), if the summation of weights of all edges that are incident to v in
F(G) equals σ, then v is called a leaf node (that is, a degree-1 node in a tree
or a degree-2 node in a cycle). Note that, for any leaf node v, ρ^{-1}(v) is a leaf of
G; conversely, for any leaf L of G, ρ(L) is a leaf node of F(G). It is shown that
F(G) can be constructed in O(|V||E|) time or in O(σ²|V| log(|V|/σ) + |E|)
time 0.
Two edges e_1, e_2 are said to be independent if and only if V(e_1) ∩ V(e_2) = ∅,
and a set Q ⊆ E is called an independent set or a matching of G if and only if any
pair of edges in Q are independent. An independent set of maximum cardinality
in G is called a maximum matching of G.

Proposition 1. For distinct sets X, Y ⊆ V of any graph G = (V,E),

d(X) + d(Y) = d(X − Y) + d(Y − X) + 2|(V − (X ∪ Y), X ∩ Y)|,   (2.1)

d(X) + d(Y) = d(X ∩ Y) + d(X ∪ Y) + 2|(X − Y, Y − X)|.   (2.2)
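Both identities of Proposition 1 are easy to spot-check numerically. The sketch below draws random simple graphs and random vertex subsets and tests (2.1) and (2.2) by counting edges. It is only an illustration: the random model, the seed, and the function names are hypothetical choices.

```python
import random
from itertools import combinations


def between(edges, A, B):
    """Number of edges with one endpoint in A and the other in B (A, B disjoint)."""
    return sum(1 for u, v in edges if (u in A and v in B) or (u in B and v in A))


def identities_hold(edges, V, X, Y):
    """Check (2.1) and (2.2) for one graph and one pair of vertex sets."""
    def d(S):
        return between(edges, S, V - S)

    lhs = d(X) + d(Y)
    rhs_21 = d(X - Y) + d(Y - X) + 2 * between(edges, V - (X | Y), X & Y)
    rhs_22 = d(X & Y) + d(X | Y) + 2 * between(edges, X - Y, Y - X)
    return lhs == rhs_21 == rhs_22


def random_check(trials=200, n=8, seed=0):
    rng = random.Random(seed)
    V = frozenset(range(n))
    for _ in range(trials):
        edges = [e for e in combinations(range(n), 2) if rng.random() < 0.4]
        X = frozenset(v for v in V if rng.random() < 0.5)
        Y = frozenset(v for v in V if rng.random() < 0.5)
        if not identities_hold(edges, V, X, Y):
            return False
    return True


print(random_check())
```

The check works because V splits into the four parts X − Y, Y − X, X ∩ Y and V − (X ∪ Y), and each side of an identity counts the edges between these parts with the same multiplicities.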

Let ⌈x⌉ (⌊x⌋, respectively) denote the minimum integer no smaller (the
maximum one no greater) than x.



2.2 σ-Components and Leaf-Graphs

Let λ(G) = σ > 0. Let X_1, X_2 be distinct (σ+1)-components of G. The pair
{X_1, X_2} is called an adjacent pair (denoted as X_1 χ̄ X_2) if any two vertices
w ∈ X_1 and w' ∈ X_2 are adjacent in G, or called a nonadjacent pair (denoted
as X_1 χ X_2) otherwise. Let

V_C = {v | v represents an individual (σ+1)-component of G}

and let S(v) ∈ Γ_G(σ+1) denote the one represented by v ∈ V_C. Let C(G) =
(V_C, E_C) be defined by V_C and E_C = {(v,v') | v, v' ∈ V_C and S(v) χ S(v')}, and it
is called the component graph of G. Let LF(G) = {X ∈ Γ_G(σ+1) | X is a leaf of G}
and V_R = {v | v represents an individual leaf of G} ⊆ V_C. Let Y(v) denote the
leaf (σ+1)-component represented by v ∈ V_R. Let R(G) = (V_R, E_R) be the
subgraph of C(G) defined by E_R = {(v,v') ∈ E_C | v, v' ∈ V_R and Y(v) χ Y(v')},
and it is called the leaf-graph of G.

Property 1. R(G) is simple.

Let Y_i, i = 1, 2, 3, 4, be distinct leaves of G. A set of two nonadjacent pairs
{Y_1, Y_2}, {Y_3, Y_4} is called a D-combination if they are disjoint (that is, {Y_1, Y_2} ∩
{Y_3, Y_4} = ∅). In general, for 2t distinct leaves Y_i, i = 1, ..., 2t, of G with t ≥ 2,
a set of t nonadjacent pairs {Y_1, Y_2}, ..., {Y_{2t−1}, Y_{2t}} is called a D-set of G if any
two pairs of the set form a D-combination. Let Y_1 χ {Y_2, Y_3} denote that both
Y_1 χ Y_2 and Y_1 χ Y_3 hold. A D-combination {{Y_1, Y_2}, {Y_3, Y_4}} is called an I-
combination (denoted as {Y_1, Y_2} I {Y_3, Y_4}) if either Y_1 χ {Y_3, Y_4} or Y_2 χ {Y_3, Y_4}
holds. If neither {Y_1, Y_2} I {Y_3, Y_4} nor {Y_3, Y_4} I {Y_1, Y_2} holds then we denote
{Y_1, Y_2} Ī {Y_3, Y_4}.

We first show some basic results on R(G) and leaves of G. 

Proposition 2. Suppose that G is simple. Then either |Y| = 1 or |Y| ≥ σ+2
for any Y ∈ LF(G).

Since each leaf Y has d_G(Y) = σ, we obtain the next proposition by
Proposition 2.

Proposition 3. Suppose that G is simple. If {Y_1, Y_2} ⊆ LF(G) is an adjacent
pair then |Y_1| = |Y_2| = 1.


Proposition 4. d_{R(G)}(v) ≥ max{|V_R| − (σ+1), 0} for any v ∈ V_R.

Fig. 1. A simple graph G with λ(G) = 3 and |LF(G)| = 4.

Fig. 2. A structural graph F(G) of G in Fig. 1, where all edge-weights are 3
and none of them are written. In this case leaves Y_i in LF(G) of the graph
G shown in Fig. 1 are represented as nodes v_i of F(G) for i = 1, ..., 5: it may
happen that F(G) has a node to which no corresponding leaf of LF(G) exists.



2.3 Examples 

Let G = (V,E) with |V| ≥ σ+2 and λ(G) = σ be any given simple graph.
Let OPT(M) or OPT(S) denote the cardinality of an optimum solution to
(σ+1)ECA(*,MA) or to (σ+1)ECA(S,SA) for G, respectively. For σ = 3, we give
an example such that OPT(S) = OPT(M) + 1. For the graph G with |LF(G)| =
4 shown in Fig. 1, R(G) is given in Fig. 3. The set of edges {(u_1, u_3), (u_2, u_4)} is
an optimum solution to 4ECA(*,MA), while {(u_1, u_3), (u 2 , us); (^ 3 : tty)} is an
optimum solution to 4ECA(S,SA) and, therefore, OPT(S) = 3 = OPT(M) + 1.

3 Maximum Matchings of Leaf-Graphs 

One of the requirements in finding a solution to (σ+1)ECA(S,SA) or (σ+1)ECA(*,
SA) with σ ≥ 1 is to obtain a largest D-set. Hence, in this section, the cardinality
of a maximum D-set is investigated by considering a maximum matching M of
R(G).

Fig. 3. The leaf-graph R(G) of G in Fig. 1.

Let M denote any fixed maximum matching of R(G) in the following
discussion unless otherwise stated, where we assume that λ(G) = σ ≥ 1.

Proposition 5. |M| satisfies one of the following {fT)\\f 3 ][

(1) If |V_R| ≥ 2σ+1 or if σ is even and |V_R| = 2σ then |M| = ⌊|V_R|/2⌋.

(2) If σ is odd and |V_R| = 2σ then

⌊|V_R|/2⌋ − 1 ≤ |M| ≤ ⌊|V_R|/2⌋.

(3) If |V_R| ≤ 2σ − 1 then

max{0, min{|V_R| − σ, ⌊|V_R|/2⌋}} ≤ |M| ≤ ⌊|V_R|/2⌋.

Corollary 1. Suppose that |V_R| = 2σ and σ = 2m + 1. If |M| = ⌊|V_R|/2⌋ − 1
then G = (V,E) is a complete bipartite graph with V = X ∪ Y, X ∩ Y = ∅,
|X| = |Y| = σ and E = {(x,y) | x ∈ X, y ∈ Y}.

The relationship among $G$, $F(G)$ and $R(G)$ shows the following proposition
concerning $|V_R|$, $|\mathcal{M}|$ and $|E'|$ of any optimum solution $E'$ to $(\sigma+1)$ECA(S,SA).

Proposition 6. Let $E'$ be any solution to $G$ in $(\sigma+1)$ECA(S,SA) and $\mathcal{M}$ be a
maximum matching of $R(G)$. Then

$|V_R| - |\mathcal{M}| \le |E'|$. (3.1)
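
The lower bound (3.1) is easy to evaluate on small inputs. The following is a minimal Python sketch, not part of the original paper: it assumes the leaf-graph $R(G)$ has already been built (its construction is given earlier in the paper and is not repeated here) and is passed in as a plain edge list; the function names are hypothetical, and the exhaustive search is only meant for tiny graphs.

```python
from itertools import combinations

def max_matching_size(edges):
    """Size of a maximum matching, found by exhaustive search (tiny graphs only)."""
    edges = list(edges)
    for k in range(len(edges), 0, -1):
        for cand in combinations(edges, k):
            endpoints = [v for e in cand for v in e]
            # A matching uses every endpoint at most once.
            if len(endpoints) == len(set(endpoints)):
                return k
    return 0

def lower_bound(num_leaves, leaf_graph_edges):
    """The quantity |V_R| - |M| on the left-hand side of inequality (3.1)."""
    return num_leaves - max_matching_size(leaf_graph_edges)

# Two leaves joined in R(G): bound 1. Two leaves not joined in R(G): bound 2.
print(lower_bound(2, [(0, 1)]), lower_bound(2, []))
```

For two leaves the bound gives 1 when the matching has one edge and 2 when it is empty, which agrees with the values of $OPT(S)$ stated for $q = 2$ in Proposition 12.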

4 Augmentation by Edge-Interchange 

We explain an operation called edge-interchange which was originally introduced 
in for an efficient augmentation. It is also used in [ri4|15ll tipi 711 iSj . Let 

$LF(G) = \{Y_1, \ldots, Y_q\}$ ($q = |LF(G)|$) denote the class of all leaves of $G$ and
choose $y_i \in Y_i$ as the representative of $Y_i$. Let

$Y(G) = \{y_i \mid Y_i \in LF(G)\}$, $q \ge 2$, and $r = \lceil q/2 \rceil$.

We can easily prove the next proposition. 

Proposition 7. If there is a set $E'$ of edges, each connecting vertices of $G$, such
that $E' \cap E = \emptyset$ and $V(E') = Y(G) \subseteq S$ for some $(\sigma+1)$-component $S$ of $G + E'$,
then $S = V$.

Let $Y$ stand for $Y(G)$ in the rest of the section.
4.1 Attachments

We have $d_G(Y_i) = \sigma$ and $\lambda(y_i, y_j; G) = \sigma$ for any $y_i, y_j \in Y$ ($i \ne j$). An edge
set $F$ is called an attachment (for $G$) if and only if the following (1) through (4)
hold:

(1) $V(F) \subseteq Y$,

(2) $F \cap E(G) = \emptyset$,

(3) $V(e) \ne V(e')$ ($\forall e, e' \in F$, $e \ne e'$), and

(4) if $q$ ($= |LF(G)|$) is odd then $F$ has at most one pair $f, f'$ such that
$|V(f) \cap V(f')| = 1$; or if $q$ is even then $F$ has no such pair.

Let $F$ be any attachment for $G$. For each $e = (u,v) \in F$, $G + F$ has a new
$(\sigma+1)$-component, denoted by $A(e, G + F)$, containing $V(e)$.

We are going to show that we can find a minimum attachment $Z(\sigma+1) = \{e_1, \ldots, e_r\}$
($r = \lceil q/2 \rceil$) such that $\lambda(G + Z(\sigma+1)) = \sigma + 1$. Although there
are two cases, $r = 1$ and $r \ge 2$, we discuss only the latter case in the following.
(Note that if $r = 1$ then we immediately obtain the desired attachment $F$.)
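
The goal $\lambda(G + Z(\sigma+1)) = \sigma + 1$ can be checked numerically on small graphs. The following is a minimal Python sketch under stated assumptions: vertices are numbered 0..n-1, the graph is given as an edge list, and edge-connectivity is computed by a plain unit-capacity max-flow rather than by any method from the paper. The function names are hypothetical.

```python
from collections import deque

def local_edge_connectivity(n, edges, s, t):
    """lambda(s, t; G): maximum number of edge-disjoint s-t paths."""
    cap = {}
    adj = [set() for _ in range(n)]
    for u, v in edges:
        # An undirected edge is modelled as two opposite unit-capacity arcs.
        cap[(u, v)] = cap.get((u, v), 0) + 1
        cap[(v, u)] = cap.get((v, u), 0) + 1
        adj[u].add(v)
        adj[v].add(u)
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        v = t
        while parent[v] is not None:  # push one unit along the BFS path
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

def edge_connectivity(n, edges):
    """lambda(G): minimum of lambda(0, t; G) over all other vertices t."""
    return min(local_edge_connectivity(n, edges, 0, t) for t in range(1, n))

cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]   # sigma = 2
Z = [(0, 2), (1, 3)]                       # two new, vertex-disjoint edges
print(edge_connectivity(4, cycle), edge_connectivity(4, cycle + Z))
```

On the 4-cycle with $\sigma = 2$, the two chords are not in $E(G)$, are distinct and share no vertex, in line with conditions (2) to (4), and adding them raises the edge-connectivity from 2 to 3.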



4.2 Finding a Minimum Attachment 

Suppose that there are an attachment $F$ for $G$ and vertices $y_{ij} \in Y - V(F)$,
$1 \le i, j \le 2$, where $y_{11}$, $y_{12}$, $y_{21}$ are distinct, and if $y_{22}$ is equal to one of the
other three then we assume that $y_{22} = y_{21}$ (see Fig. 4). We use the following
notations:

$L = G + F$, $e = (y_{11}, y_{12})$, $e' = \begin{cases} (y_{21}, y_{22}) & \text{if } y_{21} \ne y_{22}, \\ (y_{12}, y_{21}) & \text{if } y_{21} = y_{22}, \end{cases}$

$A(e) = A(e, L + \{e, e'\})$, $A(e') = A(e', L + \{e, e'\})$,

$f_1 = (y_{11}, y_{21})$, $f_2 = (y_{12}, y_{22})$, $f_3 = (y_{11}, y_{22})$, $f_4 = (y_{12}, y_{21})$,

where we set $f_1 = f_3$ and $e' = f_2 = f_4$ if $y_{21} = y_{22}$, and

$A(f_i) = \begin{cases} A(f_i, L + \{f_1, f_2\}) & \text{if } 1 \le i \le 2, \\ A(f_i, L + \{f_3, f_4\}) & \text{if } 3 \le i \le 4. \end{cases}$

Fig. 4. The edges $e$, $e'$ and $f_i$, $1 \le i \le 4$: (1) $y_{21} \ne y_{22}$; (2) $y_{21} = y_{22}$.

Note that $e, e', f_i \notin E(L)$, $1 \le i \le 4$. We have the following two cases.

Case I: $A(e) \cap A(e') = \emptyset$; Case II: $A(e) \cap A(e') \ne \emptyset$ (that is, $A(e) = A(e')$).
For Case I, we are going to show that there are two edges $f, f'$, with $V(f) \cup
V(f') = V(e) \cup V(e')$, such that

$A(e) \cup A(e') \subseteq A(f, L + \{f, f'\}) = A(f', L + \{f, f'\})$.

That is, we can add two edges so that one $(\sigma+1)$-component containing $A(e) \cup
A(e')$ may be obtained. Finding and adding such a pair of edges $f, f'$ is called
edge-interchange (with respect to $V(e_1) \cup V(e_2)$).

Suppose that $A(e) \cap A(e') = \emptyset$. Note that $y_{21} \ne y_{22}$ in this case. Let $K$
be any fixed $(A(e), A(e'))$-cut of $L + \{e, e'\}$, and let $B_i$, $1 \le i \le 2$, denote
the two sets of vertices in $L + \{e, e'\}$ such that $B_1 \cup B_2 = V$, $B_2 = V - B_1$,
$K = (B_1, B_2; L + \{e, e'\})$, $A(e) \subseteq B_1$ and $A(e') \subseteq B_2$. $|K| = \sigma = \lambda(y_1, y_2; L'')$
for any $y_i \in B_i$, $1 \le i \le 2$, where $L''$ denotes $L$, $L + e$, $L + e'$ or $L + \{e, e'\}$. $K$
is a $(y_1, y_2)$-cut of $L$. Suppose that $f$ and $f'$ satisfy either (i) or (ii):

(i) $f = f_1$, $f' = f_2$, or (ii) $f = f_3$, $f' = f_4$,
where $\{f, f'\} \cap E(L) = \emptyset$.

The next proposition shows a property of edge-interchange. 

Proposition 8. If $A(e) \cap A(e') = A(f_1) \cap A(f_2) = \emptyset$ then $A(f_3) \cap A(f_4) \ne \emptyset$,
that is, $A(f_3) = A(f_4)$.

Let $\{f, f'\}$ denote the following pair of edges:

$\{e, e'\}$ if $A(e) = A(e')$ (the case with $V(e) \cap V(e') = \emptyset$ is included);

$\{f_1, f_2\}$ if $A(e) \cap A(e') = \emptyset$ and $A(f_1) = A(f_2)$;

$\{f_3, f_4\}$ if $A(e) \cap A(e') = A(f_1) \cap A(f_2) = \emptyset$.

Clearly, $\{f, f'\} \cap E(L) = \emptyset$. Such a pair $f, f'$ is called an augmenting pair (with
respect to $\{y_{11}, y_{12}, y_{21}, y_{22}\}$) of $L$.
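
The three-case rule for choosing an augmenting pair can be sketched in code. The following is a minimal Python sketch, not the authors' implementation: it computes $(\sigma+1)$-components naively by pairwise max-flow instead of using the structural graph $F(G)$, it handles only four distinct vertices, and all function names and the vertex numbering 0..n-1 are assumptions made for illustration.

```python
from collections import deque

def local_edge_connectivity(n, edges, s, t):
    """Maximum number of edge-disjoint s-t paths (unit-capacity max-flow)."""
    cap, adj = {}, [set() for _ in range(n)]
    for u, v in edges:
        cap[(u, v)] = cap.get((u, v), 0) + 1
        cap[(v, u)] = cap.get((v, u), 0) + 1
        adj[u].add(v)
        adj[v].add(u)
    flow = 0
    while True:
        parent, queue = {s: None}, deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

def component(n, edges, v, k):
    """The k-component containing v: all u with lambda(v, u) >= k."""
    return frozenset(u for u in range(n)
                     if u == v or local_edge_connectivity(n, edges, v, u) >= k)

def augmenting_pair(n, L_edges, y11, y12, y21, y22, sigma):
    """Choose {f, f'} by the three cases; assumes four distinct vertices."""
    def same_component(a, b):
        H = list(L_edges) + [a, b]
        return component(n, H, a[0], sigma + 1) == component(n, H, b[0], sigma + 1)
    e, e_prime = (y11, y12), (y21, y22)
    f1, f2 = (y11, y21), (y12, y22)
    f3, f4 = (y11, y22), (y12, y21)
    if same_component(e, e_prime):      # Case II: A(e) = A(e')
        return e, e_prime
    if same_component(f1, f2):          # Case I with A(f1) = A(f2)
        return f1, f2
    return f3, f4                       # remaining case, by Proposition 8

six_cycle = [(i, (i + 1) % 6) for i in range(6)]   # sigma = 2
print(augmenting_pair(6, six_cycle, 0, 2, 3, 5, 2))
```

On the 6-cycle with $\sigma = 2$ and the vertices 0, 2, 3, 5, the pair $e = (0,2)$, $e' = (3,5)$ falls under Case I, and the rule selects $f_1 = (0,3)$, $f_2 = (2,5)$, whose common 3-component contains all four vertices.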

Corollary 2 . Let L' = L -\- {/, /'} for any augmenting pair /, /'. Then L' — f 
has no a -cut separating V{f) from V{f). That is, if L' — f has a a-cut K 
separating a vertex ofV{f) from V{f) then K separates the two vertices of 

vif). 

From Corollary | 2 l other important properties (Proposition IH HTTll of edge- 
interchange are obtained. 
Fig. 5. The two $(\sigma+1)$-components $A(f_1, G + \{f_1, f_2\})$ and $A(g_1, G + \{g_1, g_2\})$ produced
by two augmenting pairs $\{f_1, f_2\}$ and $\{g_1, g_2\}$, respectively.



Proposition 9. Suppose that $G$ has six leaves $Y_i \in LF(G)$ ($1 \le i \le 6$), and
choose $y_i \in Y_i$ as a representative of each $Y_i$. Suppose that $\{f_1, f_2\}$ is an
augmenting pair with respect to $\{y_i \mid 1 \le i \le 4\}$ of $G$. If $A(f_1, G + \{f_1, f_2\})$ is a leaf
then, for each $i \in \{1, 2\}$, there is an augmenting pair $\{g_1, g_2\}$ with respect to
$V(f_i) \cup \{y_5, y_6\}$ of $G$ such that $A(g_1, G + \{g_1, g_2\})$ is not a leaf (see Fig. 5).

By Proposition El we obtain the following procedure that is a modified 
version of the procedure given in HSl- It finds a sequence of edges Ci , . . . , 

($r = \lceil |LF(G)|/2 \rceil \ge 1$) by repeating the edge-interchange operation, where handling
the case with $|LF(G)| = 2$ is included. Note that the edges with which we are
concerned are those connecting vertices belonging to distinct leaves. If an edge
$g$ connects a vertex in a leaf $Y_i$ and another vertex in a leaf $Y_j$ ($i \ne j$) then, for
simplicity, we say that $g$ connects $Y_i$ and $Y_j$.

Procedure FIND_EDGES;

begin

1. $G_1 \leftarrow G$; $\pi \leftarrow LF(G)$; $i \leftarrow 1$; $E'_1 \leftarrow \emptyset$;

2. while $\pi \ne \emptyset$ do

   begin

3.   if $|\pi| = 2$ then

4.     $f_i \leftarrow$ an edge connecting the two leaves of $\pi$; $E'' \leftarrow \{f_i\}$;

5.   else if $|\pi| \le 5$ then

6.     Find an augmenting pair $E'' = \{f_i, f'\}$ by Proposition 8

7.   else /* $|\pi| \ge 6$ */

8.     Find an augmenting pair $E'' = \{f_i, f'\}$ by Proposition 9

9.   $E'_{i+1} \leftarrow E'_i \cup E''$; $G_{i+1} \leftarrow G_i + E''$; $\pi \leftarrow \pi - \{Y(v) \mid v \in V(E'')\}$; $i \leftarrow i + 1$

   end

end;



Proposition 10. Gi+i has a leaf containing A{fi,Gij.\) if and only if \LF(Gi)\ 
= 5 just after the execution of Step[^ in EIND^EDGES. 
Note that executing StepEI or Step0once can be done in 0{\Vb\) by using a
structural graph F{G), and we can construct F{G) in 0{a‘^\V\ logd^l/a) + \E\) 
time (see [Z|). The details are omitted here. 

The next proposition holds for the edge set $E'$ produced by FIND_EDGES.

Proposition 11. Let $Z(\sigma+1) = \{e_1, \ldots, e_r\}$ ($r = \lceil |LF(G)|/2 \rceil$) be given by
FIND_EDGES. Then $Z(\sigma+1)$ is a minimum attachment such that $\lambda(G') = \sigma + 1$,
where G' = G+Z{a+1). Furthermore the procedure runs in 0{a^\V\ log(|y|/cr) + 
tirne. 

5 $(\sigma+1)$ECA(S,SA) for G Having Less Than $2\sigma + 6$ Leaves

We denote $LF(G) = \{Y_i \mid 1 \le i \le q\}$ ($q = |LF(G)|$), $Y(G) = \{y_1, \ldots, y_q\}$ and
$V_R = \{v_1, \ldots, v_q\}$, where each $y_i$ is represented as $v_i$ in $R(G)$. First we consider
the case where $G$ has two or three leaves.

Proposition 12. If q = 2 then the following \(lJ\ or \(2)\ holds. 

(1) If Y 1 XY 2 then \M\ = 1, there are two vertices yi G Yi, i = 1,2, such that 
E' = {(2/1, j/2)} is a solution, and OPT{S) = OPT{M) = 1. 

(2) If Y 1 XY 2 then \M\ = 0, there are three vertices yi G Yi (i = 1,2), x G 
V — {Yi U I 2 ) such that E' = {(yi,x), (y 2 ,x)} is a solution, and OPT{S) = 
2 = OPT{M) + 1. 

Proposition 13. If q = S and there exist two leaves Yi, Y 2 with FiXh '2 then 
\Ai \ = 1, there are distinct edges ei, 62 such that E' = {ei, 62 } is a solution, and 
OPT{S) = OPT{M) = 2. 

Next we consider the remaining case where $3 \le q < 2\sigma + 6$. For each $e' =
(x', y') \in \mathcal{M}$, we can choose two vertices $x \in Y(x')$, $y \in Y(y')$, and let $e = (x, y)$
be an edge, which is not included in $E$. We fix such an edge $e$ for each $e' \in \mathcal{M}$,
and let

$E'_0 = \{e = (x, y) \mid (x', y') \in \mathcal{M}\}$.

Proposition 14. $|E'_0| = |\mathcal{M}|$ and $E'_0 \cap E = \emptyset$.

In the rest of this section, we consider the graph G + E'^. First we define two 
sets £1 and /Ci as follows. 

Let Gi = G + E'q and let C\ be the set of new leaves of G\ created by adding 
E'q to G. Clearly |£i| < \M\. Let /Ci = LF{G + £(,) - £1 (C LE{G)). Since 
A4 is a maximum matching of R{G), Proposition 0 shows that each leaf in ICi 
consists of only one vertex and that the set of vertices JC'i = {x \ {x} G /Ci} 
induces a complete graph of G and of G + E'q. 

We are going to propose an 0 ((t^|P| log(|P|/CT) + \E\ + |VrP) time algo- 
rithm such that it finds an optimum solution if |£i| < |/Ci| and such that 
a |-approximate solution if |£i| > |/Ci|. Note that we have |£i| < |/Ci| if 
\M\<[\Vr\/3\. 
Proposition 15. Let {y'i),{y'2} G K.i (y[ ^ y'2) and Yi,Y2 G £1 (Yi ^ Y2). If
{ (2/1 ) 2/1)) (2/212/2)} augmenting pair with yi G Yi and 1/2 G I2 then there 

are ys G Yi and 1/4 G Y 2 sueh that {( 2 / 4 ; 2 /i)i (2/3j2/2)} augmenting pair and 

(2/4, 2/1). (//3, 2/2) T'S'ee P/ 5 . 0 ). 





Fig. 6. A situation for Proposition^^ Fig. 7. A(/i,G + {/i,/2}) in the 

proof of Proposition cni 



We obtain the next proposition by Propositions |H| and El 

Proposition 16. Assume that \C\\ > 3 and |/Ci| > 3. Then there exists an 
augmenting pair {/i, /2| such that fi = {yi,y[) ^ EAE'^, /2 = (2/2, y'2) ^ EUE'^, 
{y'2}} (y'l 7^ y'2)’ ^1 distinct sets Yi, Y2 with yi G Yi, y2 G Y2 

and A{fi, G + {fi, /2D «s not a leaf. Furthermore L\ U/Ci — {{y'l}, {y'2}}’ Yi, Y2} 
is the set of all leaves in Gi + {/i, /2|. (See Fig. n> 

Next we are going to discuss the case where |£i| <2 or |/Ci| < 2. 

Proposition 17. Suppose that |£i| < 2 and |£i| < |/Ci|. Then there exists a 
set E'2 = {/i, . . . , /|iCi|| such that A(Gi + E'2) > a + I and £2 (£ U E() = 0 . 

It remains to consider the cases (|£i| > 3 and |/Ci| < 2) and (|£i| < 2 and 
|£i| > |/Ci|), for which the next proposition holds. 

Proposition 18. Suppose that one of the following (l)-(3) holds: (1) |£i| > 3 
and |/Ci| < 2; (2) |£i| = 2 and |/Ci| = 1; (3) |£i| = 2 and |/Ci| = 0. Let 
qi = |£Y(Gi)| and ri = [%! . Then there exists a set E'f = | fi, . . . , fo, I such 
that A(Gi +E'f)>a+1 and E” n (£ U E() = 0 . 

The discussion from Propositions im through El is summarized in the follo- 
wing procedure FIND-EDGES 2 . 
Procedure FIND_EDGES2;

begin 

1. Go — G; 7T i — LF{G)', Eq < — p < — 0; 

2. Find an edge set Eq as in Proposition [1^ Gl i — Gq + Eq] 

Determine £i and /Ci; i 1] 

3. while /Ci 0 do 

begin 

4. if \Ci\ >3 and |/Ci| >3 then 

Find an augmenting pair {f,f'} by Proposition 1161 E''^{f,f}] 

5. else if \Ci\ < 2 and \Ci\ < |/Ci| then 

Find an edge set by Proposition El 

6. else 

Find an edge set E" by Proposition CHI 

7. Construct und U E!^'] ^ E^^\ % A — i -h 1 

end; 

8 . if A(Gi) = cr then/* the case with \Ct \ 7 ^ 0 */ 

Find an edge set E" by Propositional ^i+i ^ 
end; 

Proposition 19 . EIND_EDGES2 produces an optimum solution i/|£i| < |/Ci|. 
Proposition 20 . EIND_EDGES2 gives a ^-approximate solution */|£i| > |/Ci|. 

Remark 1. Let M be any maximum matching of R{G). If |A4| < 
then |£i| < |/Ci| and we can find an optimum solution in polynomial time. If 
j ^ |_^| ^ j |£^| ^ Qj, |£^| ^ Since the proof 

of NP-completeness of fcECA(S,SA) in is given for the case with |AI| = 
we consider approximate solutions if |£i| > |/Ci|. 

Theorem 1 . Suppose that |LA(G)| < 2cr + 6. Then FIND_EDGES2 can find an 
optimum solution if |£i| < |/Ci|, or a ^-approximate solution if |£i| > |/Ci|, in 
0(cr^|P| log(|P|/cr) + |A|) time. 

6 $(\sigma+1)$ECA(S,SA) for G Having at Least $2\sigma + 6$ Leaves

In this case, Proposition 5 shows that any maximum matching $\mathcal{M}$ of $R(G)$
has $|\mathcal{M}| = \lfloor |V_R|/2 \rfloor$. First, some basic results on nonadjacent pairs and the
edge-interchange operation are going to be given.

Proposition 21 . Suppose that there are a nonadjacent pair of leaves Yi,Y 2 G 
LF{G) and two vertices yi & Yi, i = 1,2, with (yi,?/ 2 ) ^ E, such that G' = 
^+{( 2 / 112 / 2 )} has a leaf S containing Yi UP 2 - LG C — {Y C S'|P g Fg{<J + 1)}, 
X = riUF 2 anrf^ = UF6LF(G)-mv.}^- ir/ien|(A,Z;G)| <a-l z/|£'| >3. 
The next proposition can be proved by using Proposition 21.

Proposition 22 . Suppose cr > 3 and let M' = {{v2i-i,V2i)\l < i < m} C M 
for some m < \Ai\, and put Yj = Y{vj) for each Vj G Vr. 

(1) If \M.'\ > 2 and there are distinct indices i,j with 1 < < m such that 

{y 2 i-i,Y 2 i} l{Y 2 j-i,y 2 j} then (i) and (ii) hold. 

(i) These leaves are partitioned into a D-combination {{L'^, {L3, T4}} 

having four vertices yt G L[, t = 1, 2, 3 , 4 , such that G+{(j/i, j/2), (2/3, 2/4)} 
has a (a + T)-component S containing all L'^, t = 1,2, 3, 4. 

(ii) The {a +1)- component S' 0/G+K2/1, 2/2)} such thatL'fJL '2 Q S' 
is not a leaf. 

( 2 ) If \Ai'\ > [cr/2] + 1 and no such pair of indices as in \( 1 )\ exist then, for 
each (v2i-i,V2i) G A 4 ' , there are vertices 2/21-1 € d2i-i o.nd 2/21 G Y2z such 
that G' = G + {(2/21-1,2/21)} is a simple graph having a {a + l)-component 
X which is not a leaf and which contains l2i-i U l2i- 

Proposition 23 . Suppose that there is a set M' = {{v 2 i-i,V 2 i)\I < i < m} C 
A 4 for some m with cr + 2 < m < \Ai\, and put Yi = Y{vi) for each Vi G Vr. 
Then there is an edge {v2h-i,V2h) G M' with {Yi,Y2}l{Y2h-i,Y2h\ ■ 

By combining Propositions El 123 and 123 we obtain the following proposition. 

Proposition 24 . Suppose that there is a set M.' = {/i = (1121-1, '^'2i)|l < i < 
m} C Xi for some m with cr+3 <m< \Xi\, and put Yi = Y (vi) for each Vi G Vr. 
Then there exists an augmenting pair {e{,e2} with respect to Yi,Y2,Y2j-i,Y2j 
such that G + |e{,e2} is simple and has no leaf S with Yi UI2 UP2j-i UP2j C S, 
where {fi,fj} Q M' . 

Based on Proposition 12 ^ the next procedure FIND-EDGES 3 is obtained. 

Procedure EIND.EDGES 3 ; 

begin 

1. Gi ^ G; IT ^ LF{G)-, i ^ V E'o ^ 

2. while 7T 0 do 

begin 

3. if |7 t| < 3 then 

4. Find an edge set G" as E'in Proposition dn] or usi 

5 . else 

begin /* |7t| > 4 */ 

6. Find a matching Xi” = {( 112 ^- 1 , ii 2 p)|l SiP < m'} of R{Gi), 

where if |7 t| < 2cr + 6 then m' ^ otherwise to' -s— cr + 3; 

7. if |7 t| < 2ct + 6 then 

begin 

Choose E'g C E'^ with |if(| = cr + 3 — to' appropriately; 

Xi' Xi" U {(u,u;) G Er\{v' , w') G E',.,v' G Y{v),w' G Y(i(i)}; 
/* M' is a matching on R(G) in the case. */

end; 

else 



M! ^ M”\ 

8. Find an augmenting pair E” as in Proposition Ei 

by choosing /i S M" ] /* Note that \M'\ = cr + 3. */ 

9. if fj G ^A' — JA" for of Proposition El then 

begin /* In the case with |7 t| < 2ct + 6 */ 

E[ — {(y2j-ij 2/2 j)}, Gi Gi — {(t/2i-i, 2/2^)}, where 
y 2 j-i S P 2 j_i and y 2 j € V 2 j; 

end; 

10. R'i+i ^ U if"; Gi+i <— Gi + E”; 

7T ^ 7T — € V{E”)}- i ^ z + 1; 

end; 

end; 



Proposition 25. Any set final E[ obtained at the termination of FIND_EDGES3 
is a minimum attachment such that A(G') = cr + 1, where G' = G + if'. 



Theorem 2. If G has at least 2cr + 6 leaves then the algorithm FIND_EDGES3 
correctly finds a solution E' to (cr+ l)ifCA/5',5'A/ for any given G with A(G) = cr 
in 0(cr^|P| log(|F|/cr) + |if| + \Vr\^) time. 

7 Concluding Remarks 

The paper has proposed 

(1) an 0(cr^|P| log(|P|/cr) + |if| + |VrP) time algorithm for finding an optimum 
solution if G has at least 2cr + 6 leaves or if |£i| < |/Ci| and G has less than 
2cr + 6 leaves, 

(2) an 0{a‘^\V\ log(|P|/cr) + |if|) time one for a |-approximate solution if |£i| > 
|/Ci| and G has less than 2cr + 6 leaves. 

We can improve the first algorithm to an 0(cr^|P| log(|P|/cr) + |if|) time one 
by devising how to check whether or not {/i,/ 2 } is an augmenting pair, and 
whether or not A{fi,G + {/i, / 2 }) is a leaf in Proposition 0 
Abstract. We study hardness of approximating several minimaximal
and maximinimal NP-optimization problems related to the minimum linear
ordering problem (MINLOP). MINLOP is to find a minimum weight
acyclic tournament in a given arc-weighted complete digraph. MINLOP
is APX-hard but its unweighted version is polynomial time solvable. We
prove that the MIN-MAX-SUBDAG problem, which is a generalization of
MINLOP and requires finding a minimum cardinality maximal acyclic
subdigraph of a given digraph, is nevertheless APX-hard. Using results of
Håstad concerning hardness of approximating the independence number of a
graph, we then prove similar results concerning MAX-MIN-VC (respectively,
MAX-MIN-FVS), which requires finding a maximum cardinality
minimal vertex cover in a given graph (respectively, a maximum cardinality
minimal feedback vertex set in a given digraph). We also prove
APX-hardness of these and several related problems on various degree
bounded graphs and digraphs.

Keywords: NP-optimization problems, minimaximal and maximinimal
NP-optimization problems, approximation algorithms, hardness of
approximation, APX-hardness, L-reduction.



1 Introduction 

In this paper we deal with hardness of approximating several minimum-maximal
and maximum-minimal NP-complete optimization problems on graphs as well
as related maximum/minimum problems. In general, for any given instance $x$ of
such a problem, it is required to find a minimum (respectively, maximum) weight
(or cardinality) maximal (respectively, minimal) feasible solution with respect
to a partial order on the set of feasible solutions of $x$. The terminology of
minimaximal and maximinimal was apparently first used by Peters et al., though
the concept has received the attention of many others, especially in connection with
many graph problems. Examples include minimum chromatic number and its
maximum version, the achromatic number; maximum independent set and
minimaximal independent set (minimum independent dominating set);
minimum vertex cover and maximinimal vertex cover; minimum dominating
set and maximinimal dominating set; and a recent systematic study
of minimaximal and maximinimal optimization problems by Manlove.

J. van Leeuwen et al. (Eds.): IFIP TCS2000, LNCS 1872, pp. 186-|1^^ 2000.

We were led to investigate several such graph problems while considering
a generalization of the minimum linear ordering problem (MINLOP). Given a
complete digraph $G_n = (V, A_n)$ on a set $V = \{v_1, v_2, \ldots, v_n\}$ of $n$ vertices with
nonnegative integral arc weights, the MINLOP is to find an acyclic tournament
on $V$ with minimum total arc weight. This is a known NP-complete optimization
problem, and some results about approximate solutions and hardness of
approximability of MINLOP have been obtained. Two problems related
to MINLOP are the maximum acyclic subdigraph (MAX-SUBDAG) and the
minimum feedback arc set (MIN-FAS) problems. Given a digraph $G = (V, A)$, the
MAX-SUBDAG (respectively, MIN-FAS) problem is to find a subset $B \subseteq A$
of maximum (respectively, minimum) cardinality such that $(V, B)$ (respectively,
$(V, A - B)$) is an acyclic subdigraph (SUBDAG) of $G$. While MAX-SUBDAG
is APX-complete and has a trivial ^-approximation algorithm, MIN-FAS is 
not known to be in APX, though it is APX-hard m 

A generalization of MINLOP can be formulated as follows. Note that an
acyclic tournament on $V$ is indeed a maximal SUBDAG of $G_n$ (i.e., a SUBDAG of
$G_n$ which is not contained in any other SUBDAG of $G_n$). Thus we generalize MINLOP
as the minimum weight maximal SUBDAG (MIN-W-MAX-SUBDAG) problem,
which requires finding a maximal SUBDAG of minimum total arc weight in
any given arc-weighted digraph (which is not necessarily a complete digraph).
MIN-W-MAX-SUBDAG is thus APX-hard, as its special case MINLOP is. We
show that the unweighted version (i.e., all arc weights 1) of MIN-W-MAX-SUBDAG,
called MIN-MAX-SUBDAG, is APX-hard even though MINLOP with constant
arc weight is solvable in polynomial time.

The complementary problem of MIN-MAX-SUBDAG is the maximum cardinality
minimal feedback arc set (MAX-MIN-FAS) problem, in which it is required to
find a minimal feedback arc set of maximum cardinality in a given digraph. The
vertex version of this is the maximum cardinality minimal feedback vertex set
(MAX-MIN-FVS) problem. An NP-optimization problem related to MAX-MIN-FVS is
MAX-MIN-VC, in which it is required to find a minimal vertex cover of maximum
cardinality in a given graph. Another related problem is the minimum
maximal independent set (MIN-MAX-IS) problem, where one is required to find
a maximal IS (or an independent dominating set) of minimum cardinality for
any given graph.

Since the decision versions of these optimization problems are NP-complete,
it is not possible to find optimal solutions in polynomial time, unless P=NP.
So a practical alternative is to find near optimal (or approximate) solutions in
polynomial time. However, it is not always possible to obtain such solutions
having desired approximation properties. Thus it is of considerable
theoretical and practical interest to provide some qualitative explanation for this
by establishing results about hardness of obtaining such approximate solutions.
In this paper, we shall establish several results about hardness of approximating
such problems using the standard technique of reduction of one problem
to another. Due to restriction on the number of pages, we shall give outlines of
most of the lengthy proofs, details of which are in ED- The paper is organized 
as follows. In Section 2, we recall the relevant concepts about graphs, digraphs,
and approximation algorithms. In Section 3, we first prove APX-hardness of
MIN-MAX-SUBDAG for arbitrary digraphs by reducing MAX-SUBDAG to it.
Then, using the results of Håstad concerning hardness of approximating MAX-IS,
we prove similar results about MAX-MIN-VC for arbitrary graphs and about
MAX-MIN-FVS for arbitrary digraphs. In Section 4, we prove APX-hardness of
MIN-FAS and MAX-SUBDAG for $k$-total-regular digraphs, for all $k \ge 4$. Then
we show that MIN-MAX-SUBDAG is APX-hard for digraphs of maximum total
degree 12. We also prove that MAX-MIN-VC is $k$-approximable for all graphs
without any isolated vertex and having maximum degree $k$, $k \ge 1$, that MAX-MIN-VC
is APX-complete for all graphs of maximum degree 5, and that MAX-MIN-FVS
is APX-hard for all digraphs of maximum total degree 10. In Section 5, we show
that MIN-FVS is APX-complete for 6-regular graphs and MAX-MIN-FVS is
APX-hard for all graphs of maximum degree 9. Finally, in Section 6, we make
some concluding remarks.

2 Basic Concepts 

We will denote a graph (i.e. an undirected graph) by $G = (V, E)$ and a digraph
(i.e. a directed graph) by $G = (V, A)$, where $V = \{v_1, v_2, \ldots, v_n\}$, $E$ is the edge
set and $A$ is the arc set. An edge between vertices $v_i$ and $v_j$ will be denoted
by $\{v_i, v_j\}$, whereas an arc from $v_i$ to $v_j$ will be denoted by the ordered pair
$(v_i, v_j)$. In an undirected graph $G$, the degree of a vertex $v_i$ is denoted by $d(v_i)$, which
is the number of edges incident on $v_i$ in $G$, and $G$ is called $k$-regular if each
vertex in $G$ has degree $k$. In a digraph $G$, $d^+(v_i)$ and $d^-(v_i)$ are the number of
arcs in $G$ having $v_i$ as the initial vertex and terminal vertex, respectively, and
$d(v_i)$, the total degree of $v_i$, is defined as $d(v_i) = d^+(v_i) + d^-(v_i)$. A digraph
$G$ is $k$-total-regular if $d(v_i) = k$ for each vertex $v_i$. A path $P(v_1, v_t)$ in $G =
(V, E)$ (respectively, dipath in $G = (V, A)$) is a sequence of distinct vertices
$(v_1, v_2, \ldots, v_t)$ such that $\{v_i, v_{i+1}\} \in E$ (respectively, $(v_i, v_{i+1}) \in A$) for $1 \le i <
t$. A path (respectively, dipath) $P(v_1, v_t)$ is called a cycle (respectively, dicycle)
if $v_1 = v_t$.

A feedback arc set (FAS) (respectively, a directed acyclic subgraph (SUBDAG))
in a digraph $G = (V, A)$ is an arc set $B \subseteq A$ such that the subdigraph $(V, A - B)$
(respectively, $(V, B)$) is acyclic. Given a digraph $G = (V, A)$, a minimal FAS
(respectively, maximal SUBDAG) is an FAS (respectively, SUBDAG) $B \subseteq A$ which
does not contain (respectively, is not contained in) another FAS (respectively,
SUBDAG). Given a graph $G = (V, E)$, $C \subseteq V$ is called a vertex cover (VC)
if for each edge $\{v_i, v_j\} \in E$, $C$ contains either $v_i$ or $v_j$. A VC $C$ is called a
minimal VC of $G$ if no proper subset of $C$ is also a VC of $G$. $S \subseteq V$ is called a
feedback vertex set (FVS) of $G$ if the subgraph/subdigraph $G[V - S]$ induced by
the vertex set $V - S$ is acyclic. Similarly a minimal FVS of $G$ is defined.
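
The definitions above translate directly into small checkers. The following is a minimal Python sketch with hypothetical function names; arcs and edges are plain pairs, and minimality is tested by removing one element at a time, which suffices because every superset of an FAS (or of a VC) is again an FAS (or a VC).

```python
def is_acyclic(vertices, arcs):
    """Kahn's algorithm: True iff the digraph (vertices, arcs) has no dicycle."""
    indeg = {v: 0 for v in vertices}
    out = {v: [] for v in vertices}
    for u, v in arcs:
        out[u].append(v)
        indeg[v] += 1
    stack = [v for v in vertices if indeg[v] == 0]
    removed = 0
    while stack:
        u = stack.pop()
        removed += 1
        for v in out[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                stack.append(v)
    return removed == len(indeg)

def is_fas(vertices, arcs, B):
    """B is a feedback arc set iff (V, A - B) is acyclic."""
    return is_acyclic(vertices, [a for a in arcs if a not in B])

def is_minimal_fas(vertices, arcs, B):
    return is_fas(vertices, arcs, B) and all(
        not is_fas(vertices, arcs, B - {a}) for a in B)

def is_minimal_vc(edges, C):
    def covers(D):
        return all(u in D or v in D for u, v in edges)
    return covers(C) and all(not covers(C - {v}) for v in C)

V, A = [0, 1, 2], [(0, 1), (1, 2), (2, 0)]      # a directed triangle
print(is_minimal_fas(V, A, {(2, 0)}), is_minimal_fas(V, A, {(0, 1), (1, 2)}))
```

For the directed triangle, any single arc is a minimal FAS, while a set of two arcs is an FAS that is not minimal.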
The precise formulations of the problems considered in this paper are as follows:

MAX-SUBDAG (respectively, MIN-FAS)

Instance - A digraph $G = (V, A)$.

Solution - A SUBDAG $(V, B)$ (respectively, an FAS $B$) of $G$.

Cost - $m(x, B) = |B|$.

Goal - max (respectively, min).

MAX-SUBDAG-$k$ (respectively, MIN-FAS-$k$) is the problem of MAX-SUBDAG
(respectively, MIN-FAS) on $k$-total-regular digraphs.

MIN-W-FVS

Instance - A pair $x = (G, w)$ where $G$ is a graph/digraph and $w$ assigns a
nonnegative integer to each $v \in V$.

Solution - An FVS $F$ of $G$.

Cost - $m(x, F) = \sum_{v \in F} w(v)$.

Goal - min.

MIN-FVS is the unweighted version of MIN-W-FVS, i.e. MIN-W-FVS with
$w(v) = 1$ for each $v \in V$. MIN-FVS-$k$ is the problem of MIN-FVS on $k$-regular
(respectively, $k$-total-regular) graphs (respectively, digraphs).

MIN-MAX-SUBDAG (respectively, MAX-MIN-FAS)

Instance - Same as that of MAX-SUBDAG.

Solution - A maximal SUBDAG $(V, B)$ (respectively, a minimal FAS $B$) of $G$.

Cost - $m(x, B) = |B|$.

Goal - min (respectively, max).

MIN-MAX-SUBDAG$\le k$ (respectively, MAX-MIN-FAS$\le k$) is the problem of
MIN-MAX-SUBDAG (respectively, MAX-MIN-FAS) on digraphs of total degree
at most $k$.

MAX-MIN-FVS

Instance - Same as that of MIN-FVS.

Solution - A minimal FVS $B$ of $G$.

Cost - $m(x, B) = |B|$.

Goal - max.

MAX-MIN-FVS-$k$ is the problem of MAX-MIN-FVS on $k$-regular (respectively,
$k$-total-regular) graphs (respectively, digraphs) and MAX-MIN-FVS$\le k$ is the
problem of MAX-MIN-FVS on graphs (respectively, digraphs) of degree
(respectively, total degree) at most $k$.

MAX-MIN-VC

Instance - A graph $x = G = (V, E)$.

Solution - A minimal VC $C$ of $G$.

Cost - $m(x, C) = |C|$.

Goal - max.

MAX-MIN-VC$\le k$ is the problem of MAX-MIN-VC on graphs of degree at most $k$.

Given an instance $x$ of an NP optimization problem $\pi$ and $y \in sol(x)$, the
performance ratio of $y$ with respect to $x$ is defined by

$R_\pi(x, y) = \max\left\{ \frac{m(x, y)}{m^*(x)}, \frac{m^*(x)}{m(x, y)} \right\}$,

where $m^*(x)$ is the optimum value.

A polynomial time algorithm $A$ for an NP optimization problem $\pi$ is called an
$\epsilon$-approximate algorithm for $\pi$ for some $\epsilon > 1$ if $R_\pi(x, A(x)) \le \epsilon$ for any instance
$x$ of $\pi$, where $A(x)$ is the solution for $x$ given by $A$. The class APX is the set of
all NP optimization problems which have some $\epsilon$-approximate algorithm.

An approximation algorithm $A$ for an NP optimization problem $\pi$ approximates
the optimal cost within a factor of $f(n)$ if, for all instances $x$ of $\pi$, it
produces a solution $A(x)$ in polynomial time such that $R_\pi(x, A(x)) \le f(|x|)$.
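
The performance ratio is symmetric in the sense that it is at least 1 for both minimization and maximization problems. The following is a minimal Python sketch with hypothetical function names; it assumes strictly positive costs, and checking a finite sample of instances can only refute, never prove, that an algorithm is $\epsilon$-approximate.

```python
from fractions import Fraction

def performance_ratio(value, optimum):
    """R(x, y) = max(m(x, y) / m*(x), m*(x) / m(x, y)) for positive costs."""
    value, optimum = Fraction(value), Fraction(optimum)
    return max(value / optimum, optimum / value)

def is_eps_approximate(results, eps):
    """results: iterable of (m(x, A(x)), m*(x)) pairs over sampled instances."""
    return all(performance_ratio(m, opt) <= eps for m, opt in results)

# A maximization instance with value 3 against optimum 6, and a minimization
# instance with value 6 against optimum 3, have the same ratio.
print(performance_ratio(3, 6), performance_ratio(6, 3))
```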

Among the approximation preserving reductions, the L-reduction is the easiest
one to use. $\pi_1$ is said to be L-reducible to $\pi_2$, in symbols $\pi_1 \le_L \pi_2$, if
there exist two functions $f, g$ and two positive constants $\alpha, \beta$ such that:

1. For any $x \in I_{\pi_1}$, $f(x) \in I_{\pi_2}$ is computable in polynomial time.

2. For any $x \in I_{\pi_1}$ and for any $y \in sol_{\pi_2}(f(x))$, $g(x, y) \in sol_{\pi_1}(x)$ is computable
in polynomial time.

3. $m^*_{\pi_2}(f(x)) \le \alpha \cdot m^*_{\pi_1}(x)$.

4. For any $x \in I_{\pi_1}$ and for any $y \in sol_{\pi_2}(f(x))$,

$|m^*_{\pi_1}(x) - m_{\pi_1}(x, g(x, y))| \le \beta \cdot |m^*_{\pi_2}(f(x)) - m_{\pi_2}(f(x), y)|$.

We shall be using in this paper only the L-reduction, though the hardness (or
completeness) in the class APX is defined in terms of the PTAS-reduction ($\le_{PTAS}$).
An NP optimization problem $\pi$ is APX-hard if, for any $\pi' \in$ APX, $\pi' \le_{PTAS} \pi$,
and problem $\pi$ is APX-complete if $\pi$ is APX-hard and $\pi \in$ APX. However,
it is well known that for any two NP optimization problems $\pi_1$ and $\pi_2$, if
$\pi_1 \le_L \pi_2$ and $\pi_1 \in$ APX, then $\pi_1 \le_{PTAS} \pi_2$.



3 Hardness Results for Arbitrary Graphs/Digraphs 

As already noted, MIN-W-MAX-SUBDAG is APX-hard; we now show that its
unweighted version MIN-MAX-SUBDAG is also APX-hard, even though the
unweighted version of MINLOP is solvable in polynomial time. For this it is
enough to prove the following theorem, as MAX-SUBDAG is APX-complete.

Theorem 1. MAX-SUBDAG $\le_L$ MIN-MAX-SUBDAG with $\alpha = 5$ and $\beta = 1$.

Proof. (Outline) For each instance $x = G = (V, A)$ of MAX-SUBDAG, we
construct in polynomial time an instance $f(x) = G' = (V', A')$ of MIN-MAX-SUBDAG,
and with each feasible solution $(V', S')$ of $f(x)$ we associate a feasible
solution $g(S') = S = S' \cap A$ of $x$ such that $f$ and $g$ satisfy the conditions of
L-reduction with $\alpha = 5$ and $\beta = 1$.

Let $K = \{(v_i, v_j) \mid (v_i, v_j) \in A \text{ and } (v_j, v_i) \notin A\}$. For each arc $(v_i, v_j) \in K$, we
introduce a new vertex $v_{ij}$ for the construction of $G'$. Construct $G' = (V', A')$ as
follows: $V' = V \cup \{v_{ij} \mid (v_i, v_j) \in K\}$ and $A' = A \cup \{(v_j, v_i), (v_j, v_{ij}), (v_{ij}, v_i) \mid (v_i, v_j)
\in K\}$. For an example, see Figure 1. Let $k = |K|$ and $p$ be the number of pairs
of vertices $v_i, v_j \in V$ such that both $(v_i, v_j), (v_j, v_i) \in A$. Hence, $p = (|A| - k)/2$.




Fig. 1. A digraph G and the corresponding digraph G'



It is not difficult to establish the following claims. 

Claim 1 Let $(V', S')$ be a maximal SUBDAG of $G'$ and $S = S' \cap A$. Then $(V, S)$
is a SUBDAG of $G$ and $|S'| = 3k + 2p - |S|$.

Claim 2 If $(V', S'_0)$ is a minimum maximal SUBDAG of $G'$ and $S_0 = S'_0 \cap A$, then
$(V, S_0)$ is a maximum SUBDAG of $G$. Also $|A| \le 2|S_0|$.

Now $|S'_0| = 3k + 2p - |S_0| \le 3(k + p) - |S_0| \le 6|S_0| - |S_0| = 5|S_0|$. Also for
any maximal SUBDAG $(V', S')$ of $G'$, $|S_0| - |S| = |S'| - |S'_0|$. □
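
Claim 1 can be checked by brute force on very small digraphs. The following is a minimal Python sketch under stated assumptions: the gadget for each arc $(v_i, v_j) \in K$ is taken to consist of the reverse arc $(v_j, v_i)$ together with the two-arc path $v_j \to v_{ij} \to v_i$ (a reading of the construction that the extracted text does not fix unambiguously), the input has no loops or repeated arcs, and all function names are hypothetical. The enumeration is exponential and is meant only for illustration.

```python
from itertools import combinations

def is_acyclic(vertices, arcs):
    """Kahn's algorithm: True iff the digraph has no dicycle."""
    indeg = {v: 0 for v in vertices}
    out = {v: [] for v in vertices}
    for u, v in arcs:
        out[u].append(v)
        indeg[v] += 1
    stack = [v for v in vertices if indeg[v] == 0]
    removed = 0
    while stack:
        u = stack.pop()
        removed += 1
        for v in out[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                stack.append(v)
    return removed == len(indeg)

def build_g_prime(vertices, arcs):
    """Return (V', A', k, p) for the assumed gadget described in the lead-in."""
    A = set(arcs)
    K = [(u, v) for (u, v) in arcs if (v, u) not in A]
    new_vertices, new_arcs = list(vertices), list(arcs)
    for (u, v) in K:
        w = ("new", u, v)                     # plays the role of v_ij
        new_vertices.append(w)
        new_arcs += [(v, u), (v, w), (w, u)]
    return new_vertices, new_arcs, len(K), (len(A) - len(K)) // 2

def maximal_subdags(vertices, arcs):
    """Yield every maximal acyclic arc set (exhaustive search)."""
    for r in range(len(arcs) + 1):
        for S in combinations(arcs, r):
            if is_acyclic(vertices, S) and all(
                    not is_acyclic(vertices, S + (a,)) for a in arcs if a not in S):
                yield set(S)

V, A = [0, 1, 2], [(0, 1), (1, 2), (2, 0)]      # directed triangle: k = 3, p = 0
Vp, Ap, k, p = build_g_prime(V, A)
sizes = {len(Sp) for Sp in maximal_subdags(Vp, Ap)
         if len(Sp) == 3 * k + 2 * p - len(Sp & set(A))}
print(k, p, min(sizes))
```

For the directed triangle a maximum SUBDAG has 2 arcs, and the smallest maximal SUBDAG of the constructed digraph has $3k + 2p - 2 = 7$ arcs, within the bound $5|S_0| = 10$.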

Next we prove results about hardness of approximating MAX-MIN-VC and
MAX-MIN-FVS, using reducibility arguments and the results of Håstad
concerning MAX-IS stated below.

Theorem 2. [Håstad] Unless NP=ZPP (respectively, P=NP), for any $\epsilon > 0$
there exists no polynomial time algorithm to approximate MAX-IS within a factor
of $n^{1-\epsilon}$ (respectively, $n^{1/2-\epsilon}$), where $n$ is the number of vertices in an instance.

Regarding MAX-MIN-VC we have

Theorem 3. Unless NP = ZPP (respectively P=NP), for any e > 0 there exists 
no polynomial time algorithm to approximate MAX-MIN- VC within a factor of 
(respectively where n is the number of vertices in an instance. 

Proof. (Outline) Given an instance $G = (V, E)$ of MAX-IS, we construct an
instance $G' = (V', E')$ of MAX-MIN-VC, where $V' = V \cup \{v^1, v^2, \ldots, v^{n+1} \mid v \in V\}$
and $E' = E \cup \{\{v, v^1\}, \{v, v^2\}, \ldots, \{v, v^{n+1}\} \mid v \in V\}$. In other words, $G'$ is
obtained from $G$ by introducing for each vertex $v \in V$, $n + 1$ additional vertices
$v^1, \ldots, v^{n+1}$ and adding $(n + 1)$ additional edges $\{v, v^1\}, \{v, v^2\}, \ldots, \{v, v^{n+1}\}$
to the graph $G$.

We can establish the following claims without much difficulty.

Claim 1 A vertex cover $S' \subseteq V'$ of $G'$ is a minimal VC iff (a) for $v \in S' \cap V$,
$v^i \notin S'$, for any $1 \le i \le n + 1$, and (b) for $v \in V - S'$, $\{v^1, \ldots, v^{n+1}\} \subseteq S'$.

Fig. 2. An instance G of MAX-IS and the corresponding instance G' of MAX-MIN-VC

From Claim 1 it follows that for any minimal VC $S' \subseteq V'$ of $G'$, there
exists a set $S \subseteq V$ such that $S' = (V - S) \cup [\bigcup_{v \in S} \{v^1, \ldots, v^{n+1}\}]$.

Claim 2 Let $S$ be a maximal IS of $G$. Then $S' = (V - S) \cup [\bigcup_{v \in S} \{v^1, \ldots, v^{n+1}\}]$
is a minimal VC of $G'$.

Claim 3 Let $S'$ be a minimal VC in $G'$. If $V - S'$ is not a maximal IS of $G$,
then there exists a minimal VC $S''$ of $G'$ such that $V - S''$ is a maximal IS of $G$
and moreover,

(a) $|S''| > |S'|$,

(b) $|V - S''| > |V - S'|$, and

(c) $|S''| = n(|V - S''| + 1)$.

Proof. Note that, for any VC $S'$ of $G'$, $V \cap S'$ is a VC of $G$. Hence $V - S' = V -
(V \cap S')$ is an independent set of $G$. Let $S'$ be a minimal VC of $G'$ for which $V - S'$
is not a maximal independent set of $G$. Then we can always extend $(V - S')$ to a
unique maximal IS $S$ of $G$ (in polynomial time) by introducing vertices of $G$ one
by one in the order $v_1, v_2, \ldots, v_n$ while maintaining the independence property.
Hence $S \supset (V - S')$. By Claim 2, $S'' = (V - S) \cup [\bigcup_{v \in S} \{v^1, \ldots, v^{n+1}\}]$ is a
minimal VC of $G'$ and $|S''| = n(|S| + 1)$. Now we show that $S = V - S''$. For
this first note that $S \subseteq V$ as $S$ is a maximal independent set of $G$. Next, let
$v \in S$; then from the definition of $S''$ it follows that $v \notin S''$, so $v \in V - S''$.
Hence $S \subseteq V - S''$. Also, if $v \in V - S''$, then $v \notin S''$, i.e. $v \notin V - S$, so $v \in S$.
Hence $S \supseteq V - S''$. Thus $S = V - S''$. From this it follows that $V - S''$ is a
maximal independent set of $G$.

From Claim 1, we have $|S'| = |V \cap S'| + (n + 1)|V - (V \cap S')| = n(n + 1) -
n|V \cap S'| = n + n|V - S'| = n(|V - S'| + 1)$. Since $|S| > |V - S'|$, it follows that
$|S''| > |S'|$. Also (b) and (c) follow from the fact that $S = V - S''$. □

Claim 4 $S \subseteq V$ is a maximum IS of G iff $S' = (V - S) \cup [\bigcup_{v \in S} \{v^1, \ldots, v^{n+1}\}]$
is a maximum minimal VC of G'.

Proof. Let S be a maximum IS of G. By Claim 2, S' is a minimal VC of G'.
If S' is not a maximum minimal VC of G', then using Claim 3 there exists a
minimal VC S'' of G' such that |S''| > |S'|, $\hat{S} = V - S''$ is a maximal IS of G and
$|S''| = n(|\hat{S}| + 1)$. As |S'| < |S''|, |S'| = n + n|S| and $|S''| = n + n|\hat{S}|$, it follows
that $|S| < |\hat{S}|$, which is a contradiction. Hence S' is a maximum cardinality
minimal VC in G'.

Let S' be a maximum minimal VC of G'. Then by Claim 3, S = V - S'
is a maximal IS of G and |S'| = n(|S| + 1). We claim that S is a maximum
IS in G. Suppose there exists a maximal IS $S^* \subseteq V$ of G with $|S^*| > |S|$.
By Claim 2, $\tilde{S} = (V - S^*) \cup [\bigcup_{v \in S^*} \{v^1, \ldots, v^{n+1}\}]$ is a minimal VC in G'
and $|\tilde{S}| = n(|S^*| + 1)$. Since $|S^*| > |S|$, it follows that $|\tilde{S}| > |S'|$, which is a
contradiction. Hence, S is a maximum IS of G. □

Let $\alpha(G)$ denote the independence number and $\beta(G)$ denote the size of a
maximum minimal VC in G. Hence, from Claim 4, we have $\beta(G') = n(\alpha(G) + 1)$.
Now let S' be any minimal VC of G'. If V - S' is a maximal IS of G then to S' we
associate S = V - S' as the feasible solution of MAX-IS for G. If V - S' is not a
maximal IS of G then let S'' be the minimal VC of G' corresponding to S' as in
Claim 3, so that S = V - S'' is a maximal IS of G and $|S'| \le |S''| = n(|S| + 1)$.
To this minimal VC S' of G' we associate S as the feasible solution of MAX-IS
for G. Hence for any minimal VC S' of G' we have



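\[
\begin{aligned}
\frac{\alpha(G)}{|S|} &= \frac{n\,\alpha(G)}{n|S|} = \frac{\beta(G') - n}{|S''| - n}
 = \frac{\beta(G')}{|S''| - n} - \frac{n}{|S''| - n} \\
&= \frac{\beta(G')}{|S''|} \cdot \frac{|S''|}{|S''| - n} - \frac{1}{|S|}
 = \frac{\beta(G')}{|S''|} \cdot \frac{n(|S| + 1)}{n|S|} - \frac{1}{|S|} \\
&= \frac{\beta(G')}{|S''|} + \frac{1}{|S|}\left(\frac{\beta(G')}{|S''|} - 1\right) \\
&\le 2\,\frac{\beta(G')}{|S''|} \qquad \text{(since } \beta(G')/|S''| \ge 1 \text{ and } |S| \ge 1\text{)} \\
&\le 2\,\frac{\beta(G')}{|S'|},
\end{aligned}
\]

where S'' is taken to be S' itself when V - S' is already a maximal IS of G.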



Let N be the number of vertices in G'. Then $N = n^2 + 2n$, and hence $N \le 2n^2$ for
$n \ge 2$. Expressing the bound above in terms of N, for any $\epsilon > 0$, the result
follows by Theorem 2. □

Regarding MAX-MIN-FVS, we have similar results. 



Theorem 4. Unless NP = ZPP (respectively P = NP), for any $\epsilon > 0$, there exists
no polynomial time algorithm to approximate MAX-MIN-FVS within a factor of
^77 2 “'^ (respectively where n is the number of vertices in an instance. 

Proof. (Outline) We prove this by a reduction from MAX-MIN-VC to MAX- 
MIN-FVS as follows. 

Let G = (V, E) be a graph (an instance of MAX-MIN-VC). Construct an
instance G' = (V', A') of MAX-MIN-FVS from G with $V' = \bigcup_{v_i \in V} \{v_i^1, v_i^2\}$ and
$A' = [\bigcup_{v_i \in V} \{(v_i^1, v_i^2)\}] \cup [\bigcup_{\{v_i, v_j\} \in E} \{(v_i^2, v_j^1), (v_j^2, v_i^1)\}]$. In other words, for each
$v_i \in V$, G' has 2 vertices $v_i^1, v_i^2$ and an arc $(v_i^1, v_i^2)$. Also for each $\{v_i, v_j\} \in E$,
G' has the arcs $(v_i^2, v_j^1)$ and $(v_j^2, v_i^1)$. Hence, G' has 2n vertices and n + 2m arcs.

We can easily establish the following claims.

Claim 1 For any $C \subseteq V$,

(1) C is a VC of G iff $F = \{v_i^1 \mid v_i \in C\}$ is an FVS of G'.

(2) C is a minimal VC of G iff F is a minimal FVS of G'.

Claim 2 Let F be any minimal FVS of G'. Then

(1) for any $v_i \in V$, $F \cap \{v_i^1, v_i^2\}$ is either empty or a singleton.

(2) for any $v_i \in V$ such that $F \cap \{v_i^1, v_i^2\} \ne \emptyset$, $F' = F - \{v_i^1, v_i^2\} + v_i^1$ is also a
minimal FVS of G'.

(3) There is a minimal FVS F' of G' such that |F'| = |F| and $F' = \{v_i^1 \mid v_i \in C\}$
for some minimal VC C of G such that |C| = |F'|.
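Claim 1 (1) can be verified exhaustively on small instances. The sketch below is illustrative only and rests on assumptions: vertices of G are the integers 0, ..., n-1, the two copies $v_i^1, v_i^2$ are encoded as the hypothetical tuples `(i, 1)` and `(i, 2)`, and acyclicity is tested with Kahn's topological-sort procedure, which is a choice made here and not taken from the paper.

```python
def build_split_digraph(n, edges):
    """Each vertex i becomes the arc (i,1) -> (i,2); each edge {i, j} becomes
    the two arcs (i,2) -> (j,1) and (j,2) -> (i,1)."""
    vertices = [(i, s) for i in range(n) for s in (1, 2)]
    arcs = [((i, 1), (i, 2)) for i in range(n)]
    for i, j in edges:
        arcs.append(((i, 2), (j, 1)))
        arcs.append(((j, 2), (i, 1)))
    return vertices, arcs


def is_acyclic(vertices, arcs):
    """Kahn's algorithm: a digraph is acyclic iff every vertex can be peeled
    off in topological order."""
    indeg = {v: 0 for v in vertices}
    succ = {v: [] for v in vertices}
    for u, v in arcs:
        succ[u].append(v)
        indeg[v] += 1
    stack = [v for v in vertices if indeg[v] == 0]
    peeled = 0
    while stack:
        u = stack.pop()
        peeled += 1
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                stack.append(v)
    return peeled == len(vertices)


def is_feedback_vertex_set(fvs, vertices, arcs):
    """Removing fvs (and all incident arcs) must leave an acyclic digraph."""
    rest = [v for v in vertices if v not in fvs]
    kept = [(u, v) for u, v in arcs if u not in fvs and v not in fvs]
    return is_acyclic(rest, kept)


def is_vertex_cover(cover, edges):
    """Every edge of G must have an endpoint in the cover."""
    return all(i in cover or j in cover for i, j in edges)
```

On a triangle, for instance, every subset C of the vertices is a vertex cover exactly when the set of first copies of C is a feedback vertex set of the constructed digraph, and the digraph has 2n = 6 vertices and n + 2m = 9 arcs.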

Now let $F_0$ be a maximum minimal FVS of G' and F be any minimal FVS of
G'. By Claim 2, without loss of generality we can assume that every vertex in $F_0$
(respectively, in F) is $v_i^1$ for some $v_i \in V$. Also by Claim 2, $C_0 = \{v_i \mid v_i^1 \in F_0\}$
(respectively, $C = \{v_i \mid v_i^1 \in F\}$) is a maximum minimal VC (respectively, minimal
VC) of G, and $|C_0| = |F_0|$ (respectively, |C| = |F|). Hence $|F_0| / |F| = |C_0| / |C|$.

Let N = |V'|. Then N = 2n. Hence by Theorem 3 the result follows. □

4 Hardness Results for Bounded Degree Digraphs 

We know that MIN-FAS is APX-hard and MAX-SUBDAG is APX-complete
for general digraphs. In this section, we show that these problems remain
APX-hard even for k-total-regular digraphs for all $k \ge 4$. We also show that
MIN-MAX-SUBDAG (respectively, MAX-MIN-VC) is APX-hard for digraphs of
maximum total degree 12 (respectively, graphs of maximum degree 5). Regarding
MIN-FAS, we first prove the following.

Lemma 1. MIN-FAS-k $\le_L$ MIN-FAS-(k+1), for all $k \ge 1$.

Proof. We construct in polynomial time, from a k-total-regular digraph G =
(V, A), a (k + 1)-total-regular digraph G' = (V', A') where $V' = V^1 \cup V^2$, where
$V^i = \{v^i \mid v \in V\}$ for i = 1, 2, and $A' = A^1 \cup A^2 \cup B$, where
$A^i = \{(u^i, v^i) \mid (u, v) \in A\}$ for i = 1, 2 and $B = \{(v^1, v^2) \mid v \in V\}$.
From a minimal FAS S' of G' construct a minimal FAS S of G as follows:
$S = \{(u, v) \mid (u^1, v^1) \in S^1\}$, where without loss of generality we assume that
$S' = S^1 \cup S^2$ with $S^1$ and $S^2$ minimal FASs of $G^1 = (V^1, A^1)$ and
$G^2 = (V^2, A^2)$ respectively and $|S^1| \le |S^2|$. It is easy to see
that, if $S'_0$ is a minimum FAS of G', then the corresponding $S_0$ is a minimum
FAS of G and $|S'_0| = 2|S_0|$. Further, for any minimal FAS $S' = S^1 \cup S^2$ of
G' with $|S^1| \le |S^2|$, $|S'| - |S'_0| = |S^1| + |S^2| - 2|S_0| \ge 2(|S| - |S_0|)$, so that
$|S| - |S_0| \le \frac{1}{2}(|S'| - |S'_0|)$. Thus, the two inequalities of L-reduction hold with
$\alpha = 2$ and $\beta = \frac{1}{2}$. □
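The doubling construction used in this proof is easy to experiment with. The following is a small Python sketch under stated assumptions: the copies $v^1, v^2$ are encoded as the hypothetical tuples `(v, 1)` and `(v, 2)`, and the brute-force search for a minimum feedback arc set is an illustration for tiny digraphs only, not a method from the paper.

```python
from itertools import combinations


def double_digraph(vertices, arcs):
    """Two disjoint copies of the digraph plus one arc (v,1) -> (v,2) per vertex."""
    new_vertices = [(v, c) for c in (1, 2) for v in vertices]
    new_arcs = [((u, c), (v, c)) for c in (1, 2) for u, v in arcs]
    new_arcs += [((v, 1), (v, 2)) for v in vertices]
    return new_vertices, new_arcs


def total_degrees(vertices, arcs):
    """Total degree = in-degree + out-degree of every vertex."""
    deg = {v: 0 for v in vertices}
    for u, v in arcs:
        deg[u] += 1
        deg[v] += 1
    return deg


def is_acyclic(vertices, arcs):
    """Kahn's algorithm, used here only as an acyclicity test."""
    indeg = {v: 0 for v in vertices}
    succ = {v: [] for v in vertices}
    for u, v in arcs:
        succ[u].append(v)
        indeg[v] += 1
    stack = [v for v in vertices if indeg[v] == 0]
    peeled = 0
    while stack:
        u = stack.pop()
        peeled += 1
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                stack.append(v)
    return peeled == len(vertices)


def min_fas_size(vertices, arcs):
    """Smallest number of arcs whose removal leaves an acyclic digraph
    (exponential brute force, for tiny examples only)."""
    for size in range(len(arcs) + 1):
        for removed in combinations(range(len(arcs)), size):
            kept = [a for idx, a in enumerate(arcs) if idx not in removed]
            if is_acyclic(vertices, kept):
                return size
```

Applied to a directed 3-cycle (2-total-regular, minimum FAS of size 1), the doubled digraph is 3-total-regular and its minimum FAS has size 2, matching $|S'_0| = 2|S_0|$.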

We now have the following. 

Theorem 5. MIN-FAS-k is APX-hard for all $k \ge 4$.

Proof. (Outline) By Lemma 1 it is enough to show that MIN-FAS-4 is APX-
hard. For this we show that MIN-VC-3 $\le_L$ MIN-FAS-4.

We construct in polynomial time, from any 3-regular graph G = (V, E), a
4-total-regular digraph G' = (V', A') as defined in the proof of Theorem 4. For
any FAS F of G', we associate a set C of vertices of G defined as
$C = \{v \mid$ either $(v^1, v^2) \in F$ or ...$\}$.

Further, C is a VC of G with $|C| \le |F|$. For every edge $\{u, v\} \in E$, as
$(u^1, u^2, v^1, v^2, u^1)$ is a cycle in G', F must contain at least one arc from this
cycle, and so C must contain either u or v. Hence, C is a VC of G, and by the
construction of C from F, $|C| \le |F|$.

Also, it can be easily shown that if $F_0$ is a minimum FAS of G', then the
associated VC $C_0$ of G is a minimum VC of G and $|F_0| = |C_0|$, and for any
FAS F of G', $|C| - |C_0| \le |F| - |F_0|$. So the transformation from G to G' is an
L-reduction with $\alpha = 1$ and $\beta = 1$. □

Similarly, for MAX-SUBDAG, we first prove the following. 

Lemma 2. MAX-SUBDAG-k $\le_L$ MAX-SUBDAG-(k+1).

Proof. Similar to the proof of Lemma 1. □

We now prove the following. 

Theorem 6. MAX-SUBDAG-k is APX-complete for any $k \ge 4$.

Proof. By Lemma 2 it is enough to show that MAX-SUBDAG-4 is APX-hard.
For this we show that MIN-VC-3 $\le_L$ MAX-SUBDAG-4; the reduction given
in the proof of Theorem 5 is in fact an L-reduction from MIN-VC-3 to MAX-
SUBDAG-4 with $\alpha = 1$ and $\beta = 1$. □

Regarding MIN-MAX-SUBDAG, we have the following easy theorem. 

Theorem 7. MIN-MAX-SUBDAG≤12 is APX-hard.

Proof. In the proof of Theorem 1, we constructed an instance G' of MIN-MAX-
SUBDAG from an instance G of MAX-SUBDAG in such a way that if G is
4-regular then every vertex in G' is of total degree at most 12. Since MAX-
SUBDAG-4 is APX-complete, the result follows. □

Next we shall consider MAX-MIN-VC. First we have the following two simple 
lemmas. 

Lemma 3. For any 3-regular graph G = (V, E) and any maximal IS I in G,
$|I| \ge \frac{1}{4}|V|$.



Lemma 4. MAX-MIN-VC is k-approximable for graphs of maximum degree k,
$k \ge 1$, and having no isolated vertex.

Proof. Any minimal VC for such a graph is k-approximable. □

Now we have 



Theorem 8. MAX-MIN-VC≤5 is APX-complete.

Proof. Since MAX-MIN-VC is in class APX for bounded degree graphs (Lemma
4) and MAX-IS-3 is APX-complete, it is enough to show that MAX-IS-3 $\le_L$
MAX-MIN-VC≤5.

Let G = (V, E) be a 3-regular graph. From G construct G' = (V', E') of
degree at most 5 as follows: $V' = V \cup [\bigcup_{v \in V} \{v^1, v^2\}]$ and
$E' = E \cup [\bigcup_{v \in V} \{\{v, v^1\}, \{v, v^2\}\}]$.

By using the arguments given in the proof of Theorem 3 it can be proved
that any minimal VC C of G' is of the form $C = (V - I) \cup [\bigcup_{v \in I} \{v^1, v^2\}]$, for
some IS I of G where $I = V - (C \cap V)$ and |C| = |I| + n. Also, $C_0$ is a maximum
minimal VC of G' iff the associated $I_0$ is a maximum IS of G, with $|C_0| = |I_0| + n$.

Now, $|C_0| = |I_0| + n \le |I_0| + 4|I_0| = 5|I_0|$ (by Lemma 3), so that the first
inequality of L-reduction holds with $\alpha = 5$. Next, for any minimal VC C of G',
$|C_0| - |C| = |I_0| + n - |I| - n = |I_0| - |I|$, so that the second inequality of
L-reduction holds with $\beta = 1$. □

Theorem 9. MAX-MIN-FVS≤10 is APX-hard.

Proof. In the proof of Theorem 4, we constructed an instance G' of MAX-MIN-
FVS from an instance G of MAX-MIN-VC in such a way that if G is of degree
at most 5, then G' is of total degree at most 10. Since MAX-MIN-VC≤5 is
APX-complete it follows that MAX-MIN-FVS≤10 is APX-hard. □

5 Hardness Results for Bounded Degree Graphs 

In this section we establish APX-hardness of MIN-FVS and MAX-MIN-FVS for
certain restricted classes of undirected graphs. Regarding MIN-FVS, it is known
that it can be solved in polynomial time for all graphs of maximum degree 3,
but it is not known whether MIN-FVS is NP-complete for graphs of maximum
degree 4 or 5. However, it is easy to show that MIN-W-FVS≤4 is NP-complete
and also APX-complete.

Next we show that MIN-FVS-6 is APX-complete. 

Theorem 10. MIN-FVS-6 is APX-complete. 

Proof. (Outline) As MIN-FVS is in class APX, it is enough to show that
MIN-FVS-6 is APX-hard. Towards this we will show that MIN-VC-3 $\le_L$ MIN-
FVS-6.

Let G = (V, E) be a 3-regular graph. From G construct a 6-regular graph
G' = (V', E') as follows: For every edge $\{v_i, v_j\} \in E$, let $V_{ij} = \{v_{ij}^1, \ldots, v_{ij}^7\}$
be the set of seven new vertices and $H_{ij} = (V_{ij}, E_{ij})$ be the graph
obtained from the complete graph on $V_{ij}$ by removing the edge $\{v_{ij}^1, v_{ij}^7\}$. Now
$V' = V \cup [\bigcup_{\{v_i, v_j\} \in E} V_{ij}]$ and $E' = E \cup \bigcup_{\{v_i, v_j\} \in E} [E_{ij} \cup \{\{v_i, v_{ij}^1\}, \{v_{ij}^7, v_j\}\}]$, see
Figure 3. Clearly G' is 6-regular.

Let F be an FVS of G'. Then F contains at least 4 vertices from each $V_{ij}$. The
following claims can be easily established.

Fig. 3. An edge $\{v_i, v_j\} \in E$ and corresponding subgraph in G'.

Claim 1 Let F be any FVS of G' containing exactly 4 vertices from $V_{ij}$ for some
$\{v_i, v_j\} \in E$. Then F must contain either $v_i$ or $v_j$.

To an FVS F of G', we associate the set C of vertices in G defined as
$C = (F \cap V) \cup \{v_i \mid |F \cap V_{ij}| \ge 5 \text{ and } i < j\}$.

Claim 2 C is a VC of G and $|F| \ge |C| + 4|E| = |C| + 6n$.

Proof. If C is not a VC of G, then there exists $\{v_i, v_j\} \in E$ such that $C \cap \{v_i, v_j\} = \emptyset$.
By the definition of C, it follows that $|F \cap V_{ij}| \le 4$ and $F \cap \{v_i, v_j\} = \emptyset$. If
$|F \cap V_{ij}| < 4$, then F is not an FVS of G', so $|F \cap V_{ij}| = 4$. By Claim 1, F must
contain either $v_i$ or $v_j$. Otherwise F can not be an FVS of G'. This contradicts
that $F \cap \{v_i, v_j\} = \emptyset$. Hence, C is a VC of G.

Now $|F| \ge 4|E| + |F \cap V| + |\{v_i \mid |F \cap V_{ij}| \ge 5, i < j\}|$, as F contains at
least 4 vertices from $V_{ij}$ for each $\{v_i, v_j\} \in E$, and for the edges $\{v_i, v_j\} \in E$
such that $|F \cap V_{ij}| \ge 5$, F contains at least one more vertex from $V_{ij}$ in addition
to the 4 vertices already considered. Hence, $|F| \ge |C| + 4|E| = |C| + 6n$ as G is
3-regular and $|E| = \frac{3}{2}n$. □

Claim 3 For any VC C in G, the set $F = C \cup \{v_{ij}^2, v_{ij}^3, v_{ij}^4, v_{ij}^5 \mid \{v_i, v_j\} \in E\}$ is
an FVS of G' such that $C = F \cap V$ and |F| = |C| + 6n.

Claim 4 If $F_0$ is a minimum FVS of G', then the associated set $C_0$ is a minimum
VC of G and $|F_0| = |C_0| + 6n$.

Now, $|F_0| = |C_0| + 6n \le |C_0| + 12|C_0| = 13|C_0|$ (as any VC in a 3-regular graph
contains at least $\frac{n}{2}$ vertices). Hence, the first inequality of L-reduction holds with
$\alpha = 13$. Next, for any FVS F of G', $|F| - |F_0| \ge |C| + 6n - |C_0| - 6n = |C| - |C_0|$.
So the second inequality of L-reduction holds with $\beta = 1$. □
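The gadget construction of this proof can be tried out on a small 3-regular graph such as $K_4$. The sketch below is only illustrative and makes several assumptions that are not confirmed by the text: gadget vertices are encoded as hypothetical tuples `(i, j, t)` with t = 1, ..., 7, the removed gadget edge joins t = 1 and t = 7, and the four gadget vertices placed in the feedback vertex set are those with t = 2, ..., 5.

```python
def build_gadget_graph(n, edges):
    """Attach to every edge {i, j} a copy of K_7 minus one edge, joined to i
    and j through the two endpoints of the missing edge (assumed labelling)."""
    vertices = list(range(n))
    new_edges = [frozenset(e) for e in edges]
    for i, j in edges:
        gadget = [(i, j, t) for t in range(1, 8)]
        vertices += gadget
        for a in range(7):
            for b in range(a + 1, 7):
                if (a, b) != (0, 6):  # drop the edge between t = 1 and t = 7
                    new_edges.append(frozenset({gadget[a], gadget[b]}))
        new_edges.append(frozenset({i, gadget[0]}))
        new_edges.append(frozenset({gadget[6], j}))
    return vertices, new_edges


def is_forest(vertices, edges):
    """Union-find cycle test for an undirected graph."""
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for e in edges:
        u, v = tuple(e)
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
    return True


def is_fvs(fvs, vertices, edges):
    """Removing fvs must leave a forest."""
    rest = [v for v in vertices if v not in fvs]
    kept = [e for e in edges if not (e & fvs)]
    return is_forest(rest, kept)


def fvs_from_cover(cover, edges):
    """Vertex cover plus four inner gadget vertices per edge (cf. Claim 3)."""
    fvs = set(cover)
    for i, j in edges:
        fvs |= {(i, j, t) for t in (2, 3, 4, 5)}
    return fvs
```

For $K_4$ (n = 4) the resulting graph is 6-regular, and the vertex cover {0, 1, 2} gives a feedback vertex set of size |C| + 6n = 27, whereas starting from a vertex set that is not a cover leaves a cycle.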

Next we shall consider MAX-MIN-FVS. Before that we note the following. 

Lemma 5. For any FVS F of a 6-regular graph G = (V, E), $|F| \ge \frac{2}{5}n$.

Finally, we have. 

Theorem 11. MAX-MIN-FVS≤9 is APX-hard.

Proof. Let G = (V, E) be a 6-regular graph. Construct a graph G' = (V', E')
of degree at most 9 as follows: V = V \J £ V} and F' = F U 

{(u, u^), (u, u^), (u, u^), |u £ V} (see Figure 0). Let F be any 

minimal FVS of G'. Note that, for any v G V — E, F contains either or both 
and v^. Further, if u £ F fl V, then F fl {u^, v^} = (f>. To F we associate 
$C = F \cap V$, which is clearly an FVS of G. Note that $|F| \le |C| + 2|V - F| =
|C| + 2|V - C| = 2n - |C|$.

Fig. 4. A vertex v in G and its corresponding neighbors in G'

Let $F_0$ be a maximum minimal FVS of G'. Then $|F_0| = 2n - |C_0|$ where
Go = FoD V. For, if |Fo| < 2n — \Go\, then F = Co U G V — Go} is 

a minimal FVS of G' with $|F| = 2n - |C_0| > |F_0|$, contradicting our assumption
that $F_0$ is a maximum minimal FVS of G'. Also note that $C_0$ is a minimum FVS
of G.

Now, $|F_0| = 2n - |C_0| \le 5|C_0| - |C_0| = 4|C_0|$ (by Lemma 5). So, the
first inequality of L-reduction holds with $\alpha = 4$. Next, for any minimal FVS F
of G', $|F_0| - |F| \ge 2n - |C_0| - 2n + |C| = |C| - |C_0|$. So the second inequality of
L-reduction holds with $\beta = 1$. □

6 Concluding Remarks 

In this paper we have established hardness results for several NP-optimization
problems related to MINLOP. These problems are variations or generalizations
of well-known NP-optimization problems on graphs/digraphs. While for MAX-
MIN-VC and MAX-MIN-FVS we have established strong results like those of
Håstad concerning MAX-IS and MAX-CLIQUE, for others we have just
shown them to be APX-hard. Whether strong results about hardness of
approximating such problems can be obtained is worth investigating. Despite such
negative results, efforts may be made to obtain useful positive results giving
efficient algorithms which may be f(n)-approximate for a suitable function f(n).
Also, we do not have any results about the MAX-MIN-FAS problem similar to
those for MAX-MIN-FVS. These and other relevant issues concerning these
problems are being pursued.

Acknowledgment: The authors thank C. R. Subramanian for a careful reading
of an earlier draft and the anonymous referees for their comments and criticisms.

Abstract In an alternative approach to “characterizing” the graph class
of visibility graphs of simple polygons, we study the problem of finding
a maximum clique in the visibility graph of a simple polygon with n
vertices. We show that this problem is very hard, if the input polygons
are allowed to contain holes: a gap-preserving reduction from the
maximum clique problem on general graphs implies that no polynomial time
algorithm can achieve an approximation ratio of $n^{1/8-\epsilon}/4$ for any $\epsilon > 0$,
unless NP = P. To demonstrate that allowing holes in the input
polygons makes a major difference, we propose an $O(n^3)$ algorithm for the
maximum clique problem on visibility graphs for polygons without holes
(other $O(n^3)$ algorithms for this problem are already known). Our
algorithm also finds the maximum weight clique, if the polygon vertices
are weighted.

We then proceed to study the problem of partitioning the vertices of a
visibility graph of a polygon into a minimum number of cliques. This
problem is APX-hard for polygons without holes (i.e., there exists a
constant $\gamma > 0$ such that no polynomial time algorithm can achieve an
approximation ratio of $1 + \gamma$). We present an approximation algorithm
for the problem that achieves a logarithmic approximation ratio by
iteratively applying the algorithm for finding maximum weighted cliques.
Finally, we show that the problem of partitioning the vertices of a
visibility graph of a polygon with holes cannot be approximated with a
ratio of $n^{1/14-\gamma}/4$ for any $\gamma > 0$ by proposing a gap-preserving reduction.
Thus, the presence of holes in the input polygons makes this partitioning
problem provably harder.



1 Introduction 

Visibility problems have received considerable attention in the past. On the one
hand, art gallery problems - such as Minimum Vertex Guard - have been
studied intensively with respect to both bounds on descriptional complexity and
computational complexity results. On the other hand, visibility graphs
continue to draw interest. A simple polygon with(out) holes is given by its ordered
sequence of vertices on the outer boundary, together with an ordered sequence
of vertices for each hole, if any. Two polygon vertices see each other, iff the
straight line segment connecting the two vertices does not intersect the exterior
(or holes) of the polygon. A graph G = (V, E) with vertices $v_1, \ldots, v_n$ is a
visibility graph, iff there exists a simple polygon P (with or without holes) consisting
of vertices $p_1, \ldots, p_n$ such that the polygon vertices $p_i$ and $p_j$ see each other, iff
$(v_i, v_j) \in E$. The visibility graph characterization problem consists of finding a
set of graph-theoretic properties that exactly define visibility graphs. It is closely
related to the visibility graph recognition problem, which consists of determining
if a given graph is a visibility graph. A lot of work has been done on the visibility
graph characterization problem, but it still
is not satisfactorily solved. A different approach to “characterizing” the class of
visibility graphs is to determine the computational complexity (and in case of
NP-hardness the approximability) of classic graph-theoretic problems on
visibility graphs. Actually, a considerable amount of work has been done that falls in
the realm of this approach, because many classic graph-theoretic problems have
a geometric interpretation in the context of visibility graphs. Also, the problem
Minimum Coloring on Visibility Graph is mentioned as an open problem
(with respect to its computational complexity) in an open problems list.

* We gratefully acknowledge the support of this work by the Swiss National Science
Foundation.


Consider, for example, the problem Maximum Independent Set on Vis- 
ibility Graph, in which we are given a simple polygon with n vertices and we 
are to find the maximum independent set in the corresponding visibility graph. 
This problem corresponds to finding a maximum set of polygon vertices that are 
hidden from each other. The problem is therefore also called Maximum Hidden 
Vertex Set. It is known to be NP-hard, and APX-hard for polygons without

l-T 

holes and hard to approximate with an approximation ratio of — for all 7 > 0 
for polygons with holes ng. 

The problem Minimum Dominating Set on Visibility Graph corre- 
sponds to finding a minimum set C of polygon vertices such that each polygon 
vertex can be seen from at least one vertex in C. This problem is a variation 
of the well known art gallery problem Minimum Vertex Guard, which asks 
for a minimum number of vertices (guards) of a given polygon such that every 
point in the interior and on the boundary of the polygon can be seen from at 
least one guard. It is easy to see that the inapproximability results as well as 
approximability results for Minimum Vertex Guard carry over to Minimum 
Dominating Set on Visibility Graph, which therefore is APX-hard for
polygons without holes and not approximable with some approximation ratio
that is logarithmic in the number of polygon vertices for polygons with holes.
Furthermore it is approximable with a logarithmic ratio.



In this paper we study the problem Maximum Clique on Visibility Graph
with(out) Holes, in which we are given a simple polygon with(out) holes with
n vertices and we are to find the largest clique in the corresponding visibility
graph. We distinguish two separate problems by allowing holes or not. Note that
in the case of polygons without holes, this problem corresponds to finding a
largest (with respect to number of vertices) convex subpolygon of a given
polygon. The geometric interpretation in the case of polygons with holes is unclear.
This problem has potential applications in the setting up of antenna networks
in terrains, where all antennas must see each other in order to guarantee
optimum connectivity.

We show in Sect. 2 that Maximum Clique on Visibility Graph with Holes
cannot be approximated by any polynomial time algorithm with an approximation
ratio of $n^{1/8-\epsilon}/4$ for any $\epsilon > 0$, unless NP = P. Thus, Maximum
Clique on Visibility Graph with Holes is almost as hard to approximate
as clique on general graphs. We propose a gap-preserving reduction (a technique
introduced in [2]) from Maximum Clique on general graphs to get this result.

The problem Maximum Clique on Visibility Graph without Holes
is known to be solvable in time $O(n^3)$ by slightly adapting algorithms
that were developed to solve different problems (such as finding empty convex
polygons that are maximum with respect to the number of vertices by connecting
some of the input points). We propose an additional $O(n^3)$ algorithm for this
problem for polygons without holes in Sect. 3, which uses dynamic programming.
Our method also solves the weighted version of this problem, in which each vertex
is assigned a weight value and the total weight of all vertices in the clique is to
be maximized. We will use this weighted version (only with weights 0 and 1) to
obtain an approximation algorithm for another problem (see Sect. 4).¹

This gap of “solvable in cubic time” vs. “almost as hard to approximate as 
clique” is the most extreme gap ever discovered between the two versions of a 
visibility problem on polygons with vs. without holes. 

The problem Minimum Clique Partition consists of finding a partitioning 
of the vertices of a given graph into a minimum number of disjoint vertex sets, 
each of which must be a clique in the graph. Again, we can define this problem on 
visibility graphs of polygons with or without holes. In the case of polygons without
holes, this problem is closely related to the problem Minimum Convex Cover
without Holes, which consists of covering a given polygon without holes with a
minimum number of (possibly overlapping) convex polygons. Minimum Clique
Partition on Visibility Graphs without Holes is a variant of Minimum 
Convex Cover without Holes, where only the vertices are of interest (not 
the edges or the interior area of the polygon) . 

A careful analysis of the reduction that was originally constructed
to show the NP-hardness of Minimum Convex Cover reveals that
Minimum Convex Cover is APX-hard. The analysis can be easily adapted
to work for Minimum Clique Partition on Visibility Graphs without
Holes. Therefore, Minimum Clique Partition on Visibility Graphs
without Holes is APX-hard², i.e. there exists a constant $\epsilon > 0$ such that no
polynomial time approximation algorithm can achieve an approximation ratio of $1 + \epsilon$
for these problems, unless NP = P. In Sect. 4 we propose an approximation
algorithm for Minimum Clique Partition on Visibility Graphs without
Holes that iteratively applies the algorithm for the weighted version of
Maximum Clique on Visibility Graph without Holes and show that it achieves
a logarithmic approximation ratio. This result sheds some light on the
approximability of Minimum Clique Partition on Visibility Graphs without
Holes, but it still is not known whether a constant approximation ratio can
be achieved or whether the logarithmic approximation algorithm presented is
optimum.

¹ The fact that our $O(n^3)$ algorithm solves the weighted version of Maximum Clique
on Visibility Graph without Holes, which will be used as a major building
block for another approximation algorithm, is the main reason for including it in
this paper, next to the obvious reason of self-containment.

² See, e.g., [5] for an introduction to the class APX.

There seems to be no straightforward geometric interpretation of Minimum
Clique Partition on Visibility Graph with Holes, but the problem is
certainly of theoretic interest, as we propose a gap-preserving reduction in Sect. 5
from Minimum Clique Partition on general graphs that shows that Minimum
Clique Partition on Visibility Graph with Holes cannot be approximated
with an approximation ratio of $n^{1/14-\gamma}/4$ for any $\gamma > 0$.

This is the first result for a visibility problem that is NP-hard no matter
whether holes are allowed or not, where we are able to show that the
approximation properties are clearly different for the cases of polygons with vs.
without holes: While Minimum Clique Partition on Visibility Graph with
Holes cannot be approximated with an approximation ratio of $n^{1/14-\gamma}/4$ for any
$\gamma > 0$, we have a logarithmic approximation algorithm for Minimum Clique
Partition on Visibility Graphs without Holes.

In Sect. 6 we draw conclusions.

As for related work other than the previously mentioned, there are several
surveys on art gallery and visibility problems. As for computational
complexity results, Minimum Convex Cover with(out) Holes can
be approximated with a logarithmic approximation ratio. The problems
Minimum Vertex/Edge/Point Guard, which are guarding problems with
different types of guards, are known to be NP-hard and APX-hard for
polygons without holes, and inapproximable with an approximation ratio
logarithmic in the number of polygon vertices for polygons with holes.
Furthermore, Minimum Vertex/Edge Guard can be approximated with a
logarithmic approximation ratio for polygons with and without holes.

2 An Inapproximability Result for Maximum Clique on
Visibility Graph with Holes



We propose a gap-preserving reduction from the Maximum Clique problem
to the Maximum Clique on Visibility Graph with Holes problem. The
technique of gap-preserving reductions [2] maps the promise problem of
Maximum Clique to the promise problem Maximum Clique on Polygons with
Holes. Suppose we are given an instance I of the promise problem Maximum
Clique, i.e., a graph G = (V, E) with n := |V| and an integer k with $2 \le k \le n$.
We are promised that the size of a maximum clique in the graph G is either at
least k or strictly less than $k / n^{1/2-\epsilon}$, where $\epsilon > 0$ is arbitrarily small, but fixed. It
is NP-hard to decide which of these two cases is true, because otherwise,
Maximum Clique could be approximated by a polynomial time algorithm with an
approximation ratio of $n^{1/2-\epsilon}$, which cannot be done unless NP = P.

Figure 1. Basic construction: an input graph

The basic idea of the reduction is shown in Figs. 1 and 2. For each instance
I of Maximum Clique, i.e., for each graph G = (V, E) with n := |V| (as shown
in an example in Fig. 1), we construct an instance I' of Maximum Clique on
Visibility Graph with Holes, i.e., a polygon with holes (as shown in an
example in Fig. 2). The main polygon is in the shape of a regular 2n-gon with
vertices named $v_i$ and $v'_i$ for $i \in \{1, \ldots, n\}$. For each vertex pair $(v_i, v_j) \notin E$, we
construct two small triangular holes, one around the intersection point of the 
line segment from Vi to Vj and the line segment from to and one around 
the intersection point of the line segment from Vi to Vj and the line segment 
from Vj to These triangular holes are designed to block the view of vertices 
Vi and Vj that are not supposed to see each other, since they are not connected 
by an edge in the input graph. The detailed, and rather technical construction 
of the holes is described in ini, and we therefore omit it here. 

In order to make the reduction work, we refine the polygon with holes ob- 
tained thus far as follows: 

For each vertex Vi let vf iyf') be the point on the line segment from Vi to 
w'_i (v[) that is closest to point (u') such that the view of vf to v^ {v^ to 
Vj') for all Vj is still blocked by the corresponding two holes, if vertices Vi and 
Vj are not connected in the input graph by an edge. These points are illustrated 
in Fig. El 

For each vertex $v_i$, we replace the two line segments that join these two points
through $v_i$ by a convex chain of $n^3 - 1$ line segments (called the chain of $v_i$). This
is illustrated in Fig. 3. By the way that we chose the two points, it is ensured
that any two vertices from chains of $v_i$ and $v_j$ see each other, iff $(v_i, v_j) \in E$.

Figure 2. Basic construction: polygon with holes resulting from the graph in Fig. 1

The following two lemmas allow us to prove the main result of this section. Let
OPT denote the size of an optimum solution of the Maximum Clique instance
I and let OPT' denote the size of an optimum solution of the Maximum Clique
on Visibility Graph with Holes instance I'. Let $\epsilon > 0$.

Lemma 1. $OPT \ge k \Rightarrow OPT' \ge n^3 k$

Proof. If $OPT \ge k$, then there exists a clique of size k in I. We obtain a clique
in I' of size $n^3 k$ by simply letting all the vertices of the chain of $v_i$ be in the
solution, if vertex $v_i \in V$ is in the clique. □

Lemma 2. $OPT < \frac{k}{n^{1/2-\epsilon}} \Rightarrow OPT' < \frac{n^3 k}{n^{1/2-\epsilon}} + 3n^2$

Figure 3. Chain of vertex $v_i$

Proof. We prove the contraposition:
$OPT' \ge \frac{n^3 k}{n^{1/2-\epsilon}} + 3n^2 \Rightarrow OPT \ge \frac{k}{n^{1/2-\epsilon}}$.

Suppose we have a solution of I' with $\frac{n^3 k}{n^{1/2-\epsilon}} + 3n^2$ points. Since there are at
most n(n - 1) holes with 3 vertices each and n additional vertices $v'_i$, there can be
at most $3n(n - 1) + n < 3n^2$ vertices in the clique that are not part of the chain
of some $v_i$. Therefore, at least $\frac{n^3 k}{n^{1/2-\epsilon}}$ vertices of the clique must be in chains.
Since a chain consists of only $n^3$ vertices, each chain can contribute at most $n^3$
vertices to the clique. Therefore, the number of chains that contain at least one
point from the solution is at least $\frac{k}{n^{1/2-\epsilon}}$. Since no two vertices of two
different chains $v_i$ and $v_j$ see each other unless $(v_i, v_j) \in E$, we immediately have
a solution for I with at least $\frac{k}{n^{1/2-\epsilon}}$ vertices by letting $v_i$ be in the clique if at
least one point of the chain of $v_i$ is in the solution. □



Lemmas 1 and 2 transform the promise problem of Maximum Clique as
mentioned above into a promise problem of Maximum Clique on Visibility
Graph with Holes, where we are promised that an optimum solution contains
either at least $n^3 k$ vertices or strictly less than $\frac{n^3 k}{n^{1/2-\epsilon}} + 3n^2$ vertices. It is also
NP-hard to decide which of the two cases is true, since otherwise, we could solve
the NP-hard promise problem of Maximum Clique (see [2] for more details on
the notion of such gap-preserving reductions). Maximum Clique on Visibility
Graph with Holes can therefore not be approximated by any polynomial time
approximation algorithm with an approximation ratio of:

\[
\frac{n^3 k}{\frac{n^3 k}{n^{1/2-\epsilon}} + 3n^2}
\ge \frac{n^3 k}{\frac{2 n^3 k}{n^{1/2-\epsilon}}}
= \frac{n^{1/2-\epsilon}}{2}
\]

(using $3n^2 \le n^3 k / n^{1/2-\epsilon}$, which holds for $k \ge 2$ and $n \ge 3$).
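The gap computation can be checked numerically. The following Python sketch is only a sanity check under stated assumptions: it takes the chain size to be $n^3$ and the "no" threshold to be $n^3 k / n^{1/2-\epsilon} + 3n^2$ as in the lemmas above, and the function names are illustrative.

```python
def gap_ratio(n, k, eps):
    """Ratio between the 'yes' threshold n^3 k and the 'no' threshold
    n^3 k / n^(1/2 - eps) + 3 n^2 of the promise problem."""
    yes_size = n**3 * k
    no_size = n**3 * k / n**(0.5 - eps) + 3 * n**2
    return yes_size / no_size


def lower_bound(n, eps):
    """The simplified bound n^(1/2 - eps) / 2 on the gap."""
    return n**(0.5 - eps) / 2


def max_instance_size(n):
    """Upper bound on the number of polygon vertices: n chains of n^3
    vertices, n extra vertices v'_i, and at most n(n-1) triangular holes."""
    return n**4 + n + 3 * n * (n - 1)
```

For every n between 3 and 40, every k between 2 and n and several values of ε, `gap_ratio` stays above `lower_bound`, and `max_instance_size(n)` does not exceed $2n^4$.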



We now need to express the size |I'| of the Maximum Clique on Visibility
Graph with Holes instance I' by the size n of the Maximum Clique instance
I. According to the construction, $|I'| \le 2n^4$. We proceed:

\[
\frac{n^{1/2-\epsilon}}{2}
\ge \frac{1}{2}\left(\frac{|I'|}{2}\right)^{\frac{1}{4}\left(\frac{1}{2}-\epsilon\right)}
= \frac{|I'|^{1/8-\epsilon/4}}{2 \cdot 2^{1/8-\epsilon/4}}
\ge \frac{|I'|^{1/8-\gamma}}{4},
\qquad \text{where } \gamma = \epsilon/4.
\]

This completes the proof of our main theorem of this section:

Theorem 1. Maximum Clique on Visibility Graphs with Holes cannot
be approximated by any polynomial time algorithm with an approximation ratio
of $\frac{|I'|^{1/8-\gamma}}{4}$, where |I'| is the number of vertices in the polygon and where $\gamma > 0$,
unless NP = P.

3 An $O(n^3)$ Algorithm for Maximum Clique on Visibility
Graph without Holes

Our polynomial time algorithm for Maximum Clique on Visibility Graph
without Holes uses dynamic programming.

Suppose we are given a simple polygon $P$ without holes, which consists of
$n$ vertices $v_1, \ldots, v_n$ in counterclockwise order. We first compute the visibility
graph $G = G(P)$ of this polygon, which can be done in time $O(|E|)$, where $E$ is
the set of edges in $G$ [?]. This allows us to answer queries of the form “Does
vertex $v_i$ see vertex $v_j$?” in time $O(1)$. As we will use a weighted version of this
problem to find an approximation algorithm for Minimum Clique Partition
on Visibility Graph without Holes, we introduce a non-negative weight
$w_i$ for each vertex $v_i$. We are now to find a clique in $G$ that has maximum
total weight. In the following, all operations are modulo $n$, where applicable. Let
$A_{i,j,k}$ with $i < j < k$ be the maximum clique (with respect to its weight) among
all cliques that consist of vertices $v_i$, $v_j$ and $v_k$ and additional vertices $v_{j'}$ with
$i < j' < j$. Let $|A_{i,j,k}|$ denote the weight of $A_{i,j,k}$. The optimum solution OPT
is:

$OPT = A_{i,j,k}$, where $i, j, k$ are such that $|A_{i,j,k}| = \max_{i',j',k'} |A_{i',j',k'}|$

Given all $A_{i,j,k}$, OPT can be computed in $O(n^3)$ time. $A$ can be considered to
be a three-dimensional table. It is initialized as follows:

$A_{i,i+1,j} = \{v_i, v_{i+1}, v_j\}$ for all $i, j$ where vertices $v_i$, $v_{i+1}$, $v_j$ all see each other

This initialization can be done in time $O(n^2)$. The remaining entries of the table
$A$ are initialized with empty sets and then computed according to Lemma 3.

Lemma 3. Assume vertices $v_i$, $v_j$, and $v_k$ see each other. Then, $A_{i,j,k} = A_{i,j',j} \cup \{v_k\}$,
where $j'$ is such that $|A_{i,j',j}| = \max |A_{i,j'',j}|$, where the maximum is taken
over all $j''$ with $i < j'' < j$ and where $v_{j''}$ sees $v_i$, $v_j$, and $v_k$.

Proof. The proof is inductive. Suppose we know that the lemma holds for $A_{i,j,k'}$
with $k' < k$. To show that it also holds for $A_{i,j,k}$, we assume by contradiction that
there exists a clique $P'$, which consists of vertices $v_i$, $v_j$ and $v_k$ and additional
vertices $v_{l'}$ with $i < l' < j$ and which is strictly heavier than $A_{i,j,k}$ (as computed
in the Lemma).

Let $v_l$ be the vertex in $P'$ that is the neighbor of $v_j$ in $P'$ in clockwise order,
when we interpret the clique $P'$ as a convex polygon. Now, consider the clique
$A_{i,l,j}$, which is maximum by assumption. Because $v_j$ is the neighboring vertex
of $v_l$ in $P'$, we have $|P'| \le |A_{i,l,j}| + w_k$. We will now argue that vertex $v_k$ can
be added to the clique $A_{i,l,j}$ and the resulting set of vertices (i.e. $A_{i,l,j} \cup \{v_k\}$) is
still a clique.

Figure 4. Proof of Lemma 3

Consider Fig. 4. First, note that vertex $v_j$ must lie to the right of the line from
$v_i$ to $v_k$, because vertices $v_i$, $v_j$ and $v_k$ all see each other and because $i < j < k$.
Since $v_i, v_l, v_j, v_k \in P'$ and $i < l < j < k$ and since $P'$ is a clique, vertex $v_l$
must lie to the right of the line from vertex $v_i$ to $v_k$ and to the left of the line
from $v_j$ to $v_k$. Now, consider all vertices $v_{l''} \in A_{i,l,j}$ that lie between $v_i$ and $v_l$ (i.e.
$i < l'' < l$). By definition of $A_{i,l,j}$, all these vertices see $v_i$, $v_l$ and $v_j$. This implies
that all vertices $v_{l''}$ also see $v_k$, because any polygon segment blocking the view
of some vertex $v_{l''}$ to $v_k$ would imply the existence of a polygon segment that
would block the view of $v_{l''}$ to either $v_i$ or $v_l$. We have shown that all vertices
in $A_{i,l,j}$ also see $v_k$, therefore $A_{i,l,j} \cup \{v_k\}$ is a clique as well.

The polygon $A_{i,l,j}$ is among those polygons over which the maximum is taken
in the Lemma to compute $A_{i,j,k}$. Therefore, $|A_{i,j,k}| \ge |P'|$, which is a contradiction
to the assumption that $P'$ is strictly heavier than $A_{i,j,k}$. □

A trivial implementation of the algorithm thus suggested would have a
running time of $O(n)$ for each of the $O(n^3)$ table entries, which results in an overall
running time of $O(n^4)$. It is, however, possible to implement the algorithm with
a total running time of $O(n^3)$. To achieve this, we show how to compute $A_{i,j,k}$
with $i, j$ fixed and $A_{i,j',j}$ already computed for $i < j' < j$, in time $O(n)$ (for all
$k$ with $j < k < i$). This directly leads to an $O(n^3)$ algorithm, since there are
only $O(n^2)$ pairs $i, j$.
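
The trivial $O(n^4)$ implementation can be sketched as follows. This is only an illustration under stated assumptions: the visibility graph is passed in as a symmetric boolean matrix `sees` (computing it from the polygon is not shown), the names `max_weight_clique`, `sees` and `w` are hypothetical, and the indices are taken linearly with $v_i$ the lowest-numbered clique vertex instead of modulo $n$. The recurrence is only claimed to be correct for visibility graphs of simple polygons without holes, as in Lemma 3.

```python
def max_weight_clique(sees, w):
    """Return the maximum total weight of a clique (trivial O(n^4) variant).

    sees[a][b] is True when polygon vertices a and b see each other
    (assumed symmetric); w[a] is the non-negative weight of vertex a.
    """
    n = len(w)
    best = max(w, default=0)  # cliques consisting of a single vertex
    for a in range(n):
        for b in range(a + 1, n):
            if sees[a][b]:
                best = max(best, w[a] + w[b])  # cliques of two vertices

    # W[i][j][k] holds the weight |A_{i,j,k}|; None marks an empty entry.
    W = [[[None] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if not sees[i][j]:
                continue
            for k in range(j + 1, n):
                if not (sees[i][k] and sees[j][k]):
                    continue
                # Start with the clique {v_i, v_j} and try every admissible j'.
                inner = w[i] + w[j]
                for jp in range(i + 1, j):
                    if sees[jp][i] and sees[jp][j] and sees[jp][k]:
                        # W[i][jp][j] was filled while the j-loop was at jp.
                        inner = max(inner, W[i][jp][j])
                W[i][j][k] = inner + w[k]
                best = max(best, W[i][j][k])
    return best
```

For a convex polygon every pair of vertices sees each other, so the function returns the total weight of all vertices.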

To speed up the algorithm, fix $i, j$. Then compute all $v_k$ with $j < k < i$ that
are visible from $v_i$ and $v_j$. Let $K$ denote the counterclockwise ordered set of
all these vertices $v_k$. Let $L$ denote the clockwise ordered set of vertices $v_l$ with
$i < l < j$ that are visible to both $v_i$ and $v_j$. For each vertex $v_l \in L$ (working from
$v_j$ towards $v_i$): Determine which vertices $v_k \in K$ are visible from $v_l$. Let $k' < k''$.
Note that if $v_l$ sees $v_{k'} \in K$, then it also sees $v_{k''} \in K$. Let $v_{k_{\min}}$ denote the first
$v_k \in K$ that sees $v_l$. It suffices to just “link” $v_{k_{\min}} \in K$ to $A_{i,l,j}$ (depending on
the implementation, a “link” could be an entry in some record field or a pointer).
Note that as we work our way through $L$ from $v_j$ to $v_i$, the $v_{k_{\min}}$'s get smaller, i.e.
proceed towards $v_j$. Thus, determining $v_{k_{\min}}$ can be done in total time $O(|K|)$
for all $v_l \in L$ (if $|K| > |L|$, otherwise it is $O(|L|)$). We now scan through $K$.
If $v_k \in K$ is “linked” to some $A_{i,l,j}$, we compare the weight of $A_{i,l,j}$ with the
weight of the currently optimum solution. If $|A_{i,l,j}|$ is greater than the weight of
the currently optimum solution, we update the currently optimum solution to
$A_{i,l,j}$. If $v_k$ is not “linked”, we link it to the currently optimum solution. Now,
set $A_{i,j,k}$ to the currently optimum solution with $v_k$ added. We also store $|A_{i,j,k}|$.
This scanning through $K$ can be done in time $O(|K|)$. Thus, the total running
time to compute $A_{i,j,k}$ for all $k$ is $O(\max\{|L|, |K|\})$, which is $O(n)$.

Let us summarize the result of this section: 

Theorem 2. The weighted version of Maximum Clique on Visibility Graph
without Holes, where non-negative weights are assigned to the vertices, can
be solved in time $O(n^3)$ using dynamic programming.

4 An Approximation Algorithm for Minimum Clique 
Partition on Visibility Graph without Holes 

Our approximation algorithm for Minimum Clique Partition on Visibility
Graph without Holes iteratively applies the polynomial time algorithm for
the weighted version of Maximum Clique on Visibility Graph without
Holes. It works as follows for a given polygon $P$:

1. Compute the visibility graph $G(P)$ of the polygon $P$. Let all vertices have
weight 1.

2. Find the maximum weighted clique $C$ in $G(P)$ using the algorithm proposed
in Sect. 3. Let all vertices $v_i \in C$ have weight 0. Add $C$ to the solution $S$.

3. Repeat step 2 until there are no vertices with weight 1 left. Return $S$.
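
The three steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `heaviest_clique` is a hypothetical brute-force stand-in for the polynomial-time algorithm of Sect. 3 (it is exponential and only suitable for tiny inputs), and `sees` is an assumed symmetric boolean visibility matrix. The sketch keeps only the not yet covered vertices of each clique, so that the result is a partition; a subset of a clique is still a clique.

```python
from itertools import combinations


def heaviest_clique(sees, weight):
    """Brute-force stand-in for the weighted maximum clique algorithm."""
    n = len(sees)
    best, best_w = [], -1
    for r in range(1, n + 1):
        for cand in combinations(range(n), r):
            if all(sees[a][b] for a, b in combinations(cand, 2)):
                total = sum(weight[v] for v in cand)
                if total > best_w:
                    best, best_w = list(cand), total
    return best


def greedy_clique_partition(sees):
    """Greedy heuristic: repeatedly remove a clique covering most new vertices."""
    n = len(sees)
    weight = [1] * n                      # step 1: all vertices have weight 1
    solution = []
    while any(weight):                    # step 3: until no weight-1 vertex is left
        clique = heaviest_clique(sees, weight)            # step 2
        solution.append([v for v in clique if weight[v] == 1])
        for v in clique:
            weight[v] = 0                 # covered vertices no longer count
    return solution
```

Each round covers at least one new vertex, because a single uncovered vertex is itself a clique of weight 1, so the loop terminates after at most $n$ rounds.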

To obtain a performance guarantee for this algorithm, consider the Minimum
Set Cover* instance $I$, which has all polygon vertices $v_i$ as elements and in
which the vertices of each clique in the visibility graph of the polygon form a set. The
greedy heuristic for Minimum Set Cover, which consists of iteratively adding
to the solution a set that contains a maximum number of elements not yet
covered by the solution, achieves an approximation ratio of $1 + \ln n$, where $n$ is the
number of elements in $I$ [?]. Our algorithm works in exactly this way. Note that
we do not have to compute all the sets of the Minimum Set Cover instance $I$
(which would possibly be a number exponential in $n$), since it suffices to always
compute a set (or clique) that contains a maximum number of vertices not
yet covered by the solution, which is achieved by reducing the weights of the
vertices already in the solution to 0. Thus, our algorithm is polynomial.

* Minimum Set Cover consists of finding a minimum number of sets among a given
collection of sets such that each element of a given universe appears in at least one
of these sets.

Theorem 3. Minimum Clique Partition on Visibility Graph without
Holes can be approximated with an approximation ratio of $1 + \ln n$, where $n$ is
the number of polygon vertices, by a greedy heuristic.



5 An Inapproximability Result for Minimum Clique 
Partition on Visibility Graph with Holes 

Minimum Clique Partition on general graphs is equivalent to Minimum
Graph Coloring [?]. It cannot be approximated by any polynomial time
algorithm with an approximation ratio of $n^{1/7-\varepsilon}$, where $\varepsilon > 0$ and $n$ is the number
of vertices in the graph [?]. We propose a gap-preserving reduction from
Minimum Clique Partition on general graphs to Minimum Clique Partition
on Visibility Graph with Holes.

Again, we map the NP-hard promise problem of Minimum Clique
Partition on general graphs, where we are promised that an optimum solution
consists of either at most $k$ or strictly more than $n^{1/7-\varepsilon} k$ cliques, to a promise
problem of Minimum Clique Partition on Visibility Graph with Holes,
where we are promised that an optimum solution consists of either at most
$k + 3$ or strictly more than $n^{1/7-\varepsilon} k$ cliques. We use the same construction as
used in Sect. 2. However, we do not need to use the “chains” as introduced
in Sect. 2. Let $OPT$ ($OPT'$) denote the size of an optimum solution of the
Minimum Clique Partition (Minimum Clique Partition on Visibility
Graph with Holes) instance $I$ ($I'$). Let $\varepsilon > 0$.

Lemma 4. $OPT \le k \implies OPT' \le k + 3$ and $OPT > n^{1/7-\varepsilon} k \implies OPT' > n^{1/7-\varepsilon} k$.

Proof. For the first implication: If $OPT \le k$, then there exists a solution of size
$k$ in $I$. We obtain a solution in $I'$ of size $k + 3$ by simply letting all cliques from
the solution in $I$ be cliques in $I'$ and by adding three more cliques. One of these
consists of all the “bottom” vertices of all holes (i.e. those vertices that lie on
line segments between points $v_i$ and $v_i'$ for all $i$). The holes are constructed
in such a way that these vertices actually form a clique (see [?]). The second
clique consists of the “top” vertices of all holes. The third clique consists of
all vertices $v_i'$. The construction of the reduction ensures that these additional
cliques actually are cliques.

We prove the contraposition of the second implication: A solution for $I'$ can
be interpreted as a solution for $I$, where the additional vertices of $I'$ are ignored.

□

We now proceed as in Sect. 2 using the same concepts. Lemma 4 and the
fact that $|I'| \le 3n^2$ allow us to prove:

Theorem 4. Minimum Clique Partition on Visibility Graph with
Holes cannot be approximated by any polynomial time algorithm with an
approximation ratio of $\frac{|I'|^{1/14-\gamma}}{4}$, where $|I'|$ is the number of vertices in the polygon
and where $\gamma > 0$, unless NP = P.

6 Conclusion 

We have studied the two problems Maximum Clique on Visibility Graph
and Minimum Clique Partition on Visibility Graph for both polygons
with and without holes. In the case of polygons without holes, the clique
problem can be solved in polynomial time and this algorithm can be used in an
approximation algorithm for the clique partition problem to achieve a
logarithmic approximation ratio. The best inapproximability result known for the
clique partition problem without holes is APX-hardness, thus the approximability
of this problem is not yet precisely characterized.

In the case of polygons with holes, we have shown for both problems
inapproximability ratios of $n^{\varepsilon}$ for some $\varepsilon > 0$, and have thus placed these two
problems in the corresponding inapproximability class as defined in [2].

Our approach of “characterizing” the class of visibility graphs by studying 
classic graph problems for this class has been used before - at least implicitly. 
The computational complexity of the related problem of coloring the vertices 
of a visibility graph with a minimum number of colors is completely unknown 
and an open problem for future research [?]. Other open problems include, of
course, determining the exact approximation threshold for Minimum Clique 
Partition on Visibility Graph without Holes. 
Abstract. The capabilities of alternating cellular automata (ACA) to
accept formal languages are investigated. Several notions of alternation
in cellular automata have been proposed. Here we study so-called nonuniform
ACAs. Our investigations center on space bounded real-time
computations. In particular, we prove that there is no difference in
acceptance power regardless of whether one-way or two-way communication
lines are provided. Moreover, the relations between real-time ACAs and
deterministic (CA) and nondeterministic (NCA) cellular automata are
investigated. It is proved that even the real-time ACAs gain exponential
speed-up against nondeterministic NCAs. Comparing ACAs with
deterministic CAs it is shown that real-time ACAs are strictly more powerful
than real-time CAs.



1 Introduction 

Linear arrays of finite automata can be regarded as models for massively parallel
computers. Mainly they differ in how the automata are interconnected and in
how the input is supplied. Here we are investigating arrays with two very simple
interconnection patterns. Each node is connected to both of its immediate
neighbors or to its right immediate neighbor only. Correspondingly they are said to
have two-way or one-way communication lines. The input mode is parallel. At
initial time each automaton fetches an input symbol. Such arrays are commonly
called cellular automata.

Although deterministic, nondeterministic and alternating finite automata
have the same computing capability there appear to be essential differences
when they are used to construct deterministic (CA), nondeterministic (NCA)
and alternating (ACA) cellular automata. (We use the denotation OCA, NOCA
and AOCA to indicate one-way communication lines.) For example, it is a
famous open problem whether or not CAs and OCAs have the same computing
power ($L(OCA) \stackrel{?}{=} L(CA)$) [?] but the problem is solved for nondeterministic
arrays ($L(NOCA) = L(NCA)$) [?]. It is known that the real-time OCA
languages are properly contained in the linear-time OCA languages
($L_{rt}(OCA) \subset L_{lt}(OCA)$) [?]. But on the other hand, $L_{rt}(NOCA) = L_{lt}(NOCA)$ has been
shown in [?]. Since $L_{lt}(NOCA) = L_{rt}(NCA)$ (which follows from the closure of
$L_{rt}(NOCA)$ under reversal [?] and $L_{lt}(NOCA) = L_{lt}(NCA)$) we have the identity
$L_{rt}(NOCA) = L_{rt}(NCA)$. For deterministic arrays it holds $L_{rt}(OCA) \subset L_{rt}(CA)$
[?].

Altogether there is little known about the properness of the known inclusions.
The dilemma is emphasized by the open problem whether or not the real-time
deterministic CA languages are strictly included in the exponential-time
nondeterministic CA languages ($L_{rt}(CA) \stackrel{?}{=} L(NCA)$)! The latter family is identical
to NSPACE(n) (the context-sensitive languages), whereas the former is
characterizable by one-way two-head alternating finite automata [?].

In order to obtain a proper superclass that is as small as possible we cannot
add more time but we can strengthen the single cells and simultaneously reduce
the time to real-time again.

Therefore, we consider arrays built by alternating finite automata. In [?] from
the point of view of time-varying cellular automata first results concerning a
restricted variant of ACAs are shown. In a second work on alternating cellular
automata [?] three models are distinguished. In nonuniform ACAs each cell
computes its next state independently according to the local transition function.
In uniform ACAs at every time step one deterministic local transition is
nondeterministically chosen from a finite set of such functions and is applied to all
the cells. The last notion defines the weak ACAs where only the leftmost cell
of the array is an alternating automaton; all the others are nondeterministic. In
[?] it is shown that nonuniform ACAs are the most powerful of the devices and
that linear-time weak and uniform ACAs coincide. Some other results deal with
simulations between alternating Turing machines and ACAs. This topic is also
the main contribution of [?] where the simulation results of [?] are extended
and some others are shown.

Our main interest is in nonuniform ACAs under real-time restriction. The
basic notions are defined in the next section. Section 3 is devoted to the
question whether or not two-way ACAs are more powerful than one-way AOCAs.
We prove the answer to be ‘no’. In particular, the equivalence between ACAs and
AOCAs is shown for all time complexities. A second result in Section 3 is the
important technical lemma which states that a specific subclass of ACAs can be
sped up by a constant factor as long as the time complexity does not fall below
real-time. For such devices, in particular, the equivalence of real-time and
linear-time follows. In Section 4 the relations between real-time ACAs and deterministic
and nondeterministic cellular automata are investigated. It is proved that even
the real-time ACAs gain exponential speed-up against nondeterministic NCAs.
Comparing ACAs with deterministic CAs it is shown that real-time ACAs are
strictly more powerful than real-time CAs. Thus, a proper superclass of the
real-time CA languages is obtained. Since NSPACE(n) is included in ATIME($n^2$)
and, on the other hand, $L_{rt}(ACA)$ will be shown to contain NSPACE(n) and is
contained in ATIME($n^2$) as well [?], we conclude that the real-time ACAs are a
reasonable model.

The latter result becomes important in so far as it is not known whether one 
of the following inclusions is strict: 



$L_{rt}(CA) \subseteq L_{lt}(CA) \subseteq L(OCA) \subseteq L(CA) \subseteq L(NCA)$
2 Basic Notions

We denote the rational numbers by $\mathbb{Q}$, the integers by $\mathbb{Z}$, the positive integers
$\{1, 2, \ldots\}$ by $\mathbb{N}$, the set $\mathbb{N} \cup \{0\}$ by $\mathbb{N}_0$ and the powerset of a set $S$ by $2^S$. The
empty word is denoted by $\varepsilon$ and the reversal of a word $w$ by $w^R$. For the length
of $w$ we write $|w|$.

An alternating cellular automaton is a linear array of identical alternating
finite automata, sometimes called cells, where each of them is connected to
both of its nearest neighbors (one to the left and one to the right). For our
convenience we identify the cells by positive integers. The state transition of the cells
depends on the current state of the cell itself and the current states of both of its
neighbors. The finite automata work synchronously at discrete time steps. Their
states are partitioned into existential and universal ones. What turns a, so
far, nondeterministic computation into an alternating computation is the mode of
acceptance, which will be defined with respect to the partitioning. More formally:



Definition 1.

An alternating cellular automaton (ACA) is a system $(S, \delta, \#, A, F)$ where

1. $S$ is the finite, nonempty set of states which is partitioned into existential
($S_e$) and universal ($S_u$) states: $S = S_e \cup S_u$,

2. $\# \notin S$ is the boundary state,

3. $A \subseteq S$ is the nonempty set of input symbols,

4. $F \subseteq S$ is the set of accepting states,

5. $\delta$ is the finite, nonempty set of local transition functions which map from
$(S \cup \{\#\})^3$ to $S$.

Let $M = (S, \delta, \#, A, F)$ be an ACA. A configuration of $M$ at some time $t \ge 0$ is a
description of its global state, which is actually a mapping $c_t : [1, \ldots, n] \to S$ for
$n \in \mathbb{N}$. The configuration at time 0 is defined by the initial sequence of states.
For a given input word $w = w_1 \cdots w_n \in A^+$ we set $c_{0,w}(i) := w_i$, $1 \le i \le n$.
Subsequent configurations are chosen according to the global transition $\Delta$:

Let $n \in \mathbb{N}$ be a positive integer and $c$ resp. $c'$ be two configurations defined
by $s_1, \ldots, s_n \in S$ resp. $s_1', \ldots, s_n' \in S$.

$c' \in \Delta(c) \iff \exists\, \delta_1, \ldots, \delta_n \in \delta :$

$s_1' = \delta_1(\#, s_1, s_2),\ s_2' = \delta_2(s_1, s_2, s_3),\ \ldots,\ s_n' = \delta_n(s_{n-1}, s_n, \#)$

Thus, $\Delta$ is induced by $\delta$. Observe that one can equivalently define ACAs by
requiring just one unique nondeterministic local transition that maps from
$(S \cup \{\#\})^3$ to $2^S \setminus \{\emptyset\}$. But with an eye towards later constructions we are requiring
a finite, nonempty set of deterministic local transitions from which each cell
nondeterministically chooses one at every time step. Obviously, both definitions
yield equivalent devices.
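
The global transition $\Delta$ of a (nonuniform) ACA can be sketched as follows. This is only an illustration under stated assumptions: configurations are modeled as tuples of states, local transition functions as Python callables taking (left, own, right), and the names `successors`, `deltas` and `BORDER` are hypothetical.

```python
from itertools import product

BORDER = "#"  # boundary state, assumed not to be an ordinary state


def successors(config, deltas):
    """Return the set Delta(c) of successor configurations of a nonuniform ACA.

    Every cell independently picks one local transition function from `deltas`,
    so the successors are the product of the per-cell sets of next states.
    """
    n = len(config)
    padded = (BORDER,) + tuple(config) + (BORDER,)
    per_cell = []
    for i in range(1, n + 1):
        left, own, right = padded[i - 1], padded[i], padded[i + 1]
        per_cell.append({d(left, own, right) for d in deltas})
    return set(product(*per_cell))
```

With two local functions, one that keeps the state and one that flips a bit, a configuration of two cells has four successors, because each cell chooses independently.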

The evolution of $M$ is represented by its computation tree.

The computation tree $T_{M,w}$ of $M$ under input $w \in A^+$ is a tree whose nodes
are labeled by configurations. The root of $T_{M,w}$ is labeled by $c_{0,w}$. The children
of a node labeled by a configuration $c$ are the nodes labeled by the possible
successor configurations of $c$. Thus, the node $c$ has exactly $|\Delta(c)|$ children.

If the state set is a Cartesian product of some smaller sets $S = S_0 \times S_1 \times
\cdots \times S_r$, we will use the notion register for the single parts of a state. The
concatenation of a specific register of all cells forms a track.

If the flow of information is restricted to one-way, the resulting device is an
alternating one-way cellular automaton (AOCA). That is, the next state of each cell
depends on the current state of the cell itself and the state of its immediate
neighbor to the right. Thus, we have information flow from right to left. Accordingly
acceptance in ACAs and AOCAs is indicated by the leftmost cell of the array:
A configuration $c$ is accepting iff $c(1) \in F$.

In order to define accepting computations on input words we need the notion 
of accepting subtrees. 

Definition 2. Let $M = (S, \delta, \#, A, F)$ be an ACA or an AOCA and $T_{M,w}$ be
its computation tree for an input word $w \in A^n$, $n \in \mathbb{N}$. A finite subtree $T'$ of
$T_{M,w}$ is said to be an accepting subtree iff it fulfills the following conditions:

1. The root of $T'$ is the root of $T_{M,w}$.

2. Let $c$ be a node in $T'$. If $c' \in \Delta(c)$ is a child of $c$ in $T'$ then the set of all
children of $c$ in $T'$ is $\{c'' \in \Delta(c) \mid c''(i) = c'(i)$ for all $1 \le i \le n$ such that
$c(i) \in S_e\}$.

3. The leaves of $T'$ are labeled by accepting configurations.

From the computational point of view an accepting subtree is built by letting
all the cells in existential states do their nondeterministic guesses and,
subsequently, spawning all possible distinct offspring configurations with respect to
the cells in universal states.

Conversely, one could build the subtree by spawning all possible distinct
offspring configurations with respect to the cells in universal states at first, and
letting cells in existential states do their guesses in each offspring configuration
independently. Fortunately, it has been shown [?] that both methods lead to
time complexities which differ at most by a constant factor. Moreover, the proofs
given in the following can easily be adapted to that mode of acceptance such
that both methods are equivalent in the framework in question.

Definition 3. Let $M = (S, \delta, \#, A, F)$ be an ACA or an AOCA.

1. A word $w \in A^+$ is accepted by $M$ if there exists an accepting subtree of
$T_{M,w}$.

2. $L(M) = \{w \in A^+ \mid w$ is accepted by $M\}$ is the language accepted by $M$.

3. Let $t : \mathbb{N} \to \mathbb{N}$, $t(n) \ge n$, be a mapping. If for all $w \in L(M)$ there exists an
accepting subtree of $T_{M,w}$ the height of which is less than $t(|w|)$, then $L(M)$ is
said to be of time complexity $t$.
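
The existence of an accepting subtree (Definitions 2 and 3) can be tested by a direct recursive search for small examples. The following sketch is an illustration only: configurations are tuples, local transitions are callables taking (left, own, right), the step budget `steps` is a simplified stand-in for the height bound, and all names are hypothetical. The search is exponential and is not meant as an efficient procedure.

```python
from itertools import product

BORDER = "#"  # boundary state, assumed not to be an ordinary state


def successors(config, deltas):
    """All successor configurations; every cell picks its function independently."""
    padded = (BORDER,) + tuple(config) + (BORDER,)
    per_cell = [
        {d(padded[i - 1], padded[i], padded[i + 1]) for d in deltas}
        for i in range(1, len(config) + 1)
    ]
    return set(product(*per_cell))


def accepts(config, steps, deltas, existential, accepting):
    """True iff an accepting subtree of height at most `steps` is rooted at config."""
    if config[0] in accepting:      # leftmost cell accepts: config may be a leaf
        return True
    if steps == 0:
        return False
    # Children of a node must agree on all cells that are existential in config,
    # so group the successors by their values at those positions.
    exist_pos = [i for i, s in enumerate(config) if s in existential]
    groups = {}
    for nxt in successors(config, deltas):
        key = tuple(nxt[i] for i in exist_pos)
        groups.setdefault(key, []).append(nxt)
    # One guess of the existential cells must work for all universal branches.
    return any(
        all(accepts(c, steps - 1, deltas, existential, accepting) for c in group)
        for group in groups.values()
    )
```

In a one-cell example where a cell may turn into an accepting or a rejecting state, an existential cell accepts (it can guess the good branch) while a universal cell does not (the rejecting branch must be covered as well).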

An ACA (AOCA) $M$ is nondeterministic if the state set consists of existential
states only. An accepting subtree is now a list of configurations which corresponds
to a possible computation path of $M$. Nondeterministic cellular automata are
denoted by NCA resp. NOCA.

An ACA (AOCA) is deterministic if the set $\delta$ of local transition functions
is a singleton. In these cases the course of computation is unique for a given
input word $w$ and, thus, the whole computation tree is a list of configurations.
Deterministic cellular automata are denoted by CA resp. OCA.

The family of all languages which can be accepted by devices of a type POLY
with time complexity $t$ is denoted by $L_t(POLY)$. If $t$ equals the identity function
$id(n) := n$ acceptance is said to be in real-time and we write $L_{rt}(POLY)$. The
linear-time languages $L_{lt}(POLY)$ are defined according to

$$L_{lt}(POLY) := \bigcup_{k \in \mathbb{Q},\, k \ge 1} L_{k \cdot id}(POLY)$$



3 Equivalence of One-Way and Two-Way Information 
Flow and Linear Speed-up 

This section is devoted to the relationship between ACAs and AOCAs and the
speed-up of a restricted version that becomes important in subsequent proofs.
The main results are that for arbitrary time complexities there is no difference
in acceptance power between one-way and two-way information flow and the
possibility to speed up so-called uniformly universal ACAs and AOCAs by a
constant factor as long as they do not fall below real-time. In particular, by the
latter result we can show the results in the next sections even for real-time
language families.

Without loss of generality we may assume that once a cell becomes
accepting it remains in accepting states permanently. Such a behavior is simply
implemented by setting a flag in an additional register that will never be unset.
Obviously, the accepted language is not affected thereby, since if a node labeled
by an accepting configuration belongs to a finite accepting subtree then there
exists a finite accepting subtree where the node is a leaf (it is simply constructed
by omitting all offspring of that node).

The next result states that one-way information flow in alternating
cellular automata is as powerful as two-way information flow. This, on one hand,
gives us a normalization since for proofs and constructions it is often useful to
reduce the technical challenge to one-way transitions and, on the other hand,
indicates the power of alternations since it is well known that deterministic
one-way languages form a proper subset of the deterministic two-way languages:

$L_{rt}(OCA) \subset L_{rt}(CA)$ [?].

Theorem 4. Let $t : \mathbb{N} \to \mathbb{N}$, $t(n) \ge n$, be a mapping. Then

$L_t(AOCA) = L_t(ACA)$

Proof. For structural reasons it suffices to show $L_t(ACA) \subseteq L_t(AOCA)$.
The idea for the simulation of an ACA by an AOCA without any loss of time
is as follows: A cell of the AOCA ‘knows’ the current states of itself and of its
neighbor to the right. Additionally, it guesses the state of its neighbor to the
left and simulates the two-way transition of the ACA. In order to verify whether
or not the guesses are correct each cell stores its guessed state and its old state
in additional registers. After performing a simulation step the verification can
simply be done by comparing the old state of a cell with the guessed state of its
neighbor to the right. Thus, the verification is done by the neighbor to the left
of a cell, respectively.
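
The verification step described above can be illustrated in isolation. The sketch below assumes that the two additional registers are given as two lists, `old_states` and `guessed_left` (hypothetical names), and it only reports which guesses were wrong; the marking of cells and the error signals of the full construction are not modeled.

```python
def failed_guesses(old_states, guessed_left):
    """Return the indices of cells whose guess of the left neighbor was wrong.

    old_states[i] is the old state stored by cell i, guessed_left[i] is the
    state that cell i guessed for its left neighbor in the same step.
    """
    failed = []
    for i in range(len(old_states) - 1):
        # Cell i can read its right neighbor i + 1 (one-way flow, right to left)
        # and compares that neighbor's guess with its own old state.
        if guessed_left[i + 1] != old_states[i]:
            failed.append(i + 1)
    return failed
```

The guess of the leftmost cell (index 0) is never checked here, which matches the observation in the proof that the guesses of the leftmost cell are not verified by a neighbor.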

Obviously, the guesses of the leftmost cell are not verified. But we can restrict
the local transition as follows: If the initial state of a cell is existential and its
guessed left neighbor state is not the border state then it is marked by a ‘−’
during the first time step. If the initial state of a cell is universal and its guessed
left neighbor state is not the border state then it is marked by a ‘+’. The effect
of these marks is that the cells with a ‘−’ will never and the cells with a ‘+’ will
always accept. Thus, if the cell is not the leftmost cell this behavior does not
affect the overall computation result. But if the cell is the leftmost cell only the
correct guesses are relevant during the remaining computation.

Moreover, a left border state is guessed by a cell if and only if that cell has 
guessed a left border state at the first time step. Therefore, to guess a left border 
state at every time step is the only way for the leftmost cell to become accepting. 
But exactly in these cases it has simulated the correct behavior of the leftmost 
cell of the two-way ACA. 

Up to now we kept quiet about a crucial point. Whereas the verification itself 
is a deterministic task which can be performed by cells in existential as well as 
in universal states, responding to the result of the verification needs further 
mechanisms. 

We distinguish two cases: If the old state of a cell is an existential one and the
verification by the left neighboring cell fails then the latter sends an error signal
to the left that prevents the unmarked cells it passes through from accepting.
Therefore, in an accepting subtree there are only nodes labeled by
configurations in which existential cells have guessed right and, hence, have simulated the
two-way transition correctly. If the verification succeeds no further reaction is
necessary.

In the second case the old state of a cell is a universal one. If the verification
by the left neighboring cell fails it sends an error signal to the left that forces
all unmarked cells it passes through to switch into an accepting state. Again, if
the verification succeeds no further reaction is necessary.

What is the effect of these mechanisms? In an accepting subtree, in all
configurations with a common predecessor, the cells that have been existential in the
predecessor are in the same states, respectively. Due to the first case these cells
have simulated the two-way transition correctly. Since all siblings (spawned by
universal states) have to lead to subtrees with accepting leaves but acceptance
according to the two-way ACA depends on the configurations with correct guesses
only, all configurations with wrong guesses are forced to accept to achieve the
desired behavior.

Altogether it follows that the AOCA can simulate the ACA without any loss
of time. □



Corollary 5. Lrt(AOCA) = Lrt(ACA) 

As we have shown, extending the information flow from one-way to two-way
does not lead to more powerful devices. The next lemma states that increasing
the computation time by a constant factor does not either if we restrict the
computations to the uniformly universal mode. A corresponding result does not
hold for deterministic cellular automata. Instead, $L_{rt}(OCA) \subset L_{lt}(OCA)$ has
been shown [?]. The relationship is a famous open problem for deterministic
two-way devices (e.g. [?]).

Uniform ACAs have been introduced in [9, 11]. The main difference between
uniform ACAs and (nonuniform) ACAs is the induction of the global transition.
Whereas in an ACA at every time step each cell chooses independently one local
transition, in a uniform ACA at every time step one local transition is chosen
globally and applied to all the cells:

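Let $M = (S, \delta, \#, A, F)$ be a uniform ACA, $n \in \mathbb{N}$ be a positive integer and
$c$ resp. $c'$ be two configurations defined by $s_1, \ldots, s_n \in S$ resp. $s_1', \ldots, s_n' \in S$.

$c' \in \Delta(c) \iff \exists\, \delta_u \in \delta :$

$s_1' = \delta_u(\#, s_1, s_2),\ s_2' = \delta_u(s_1, s_2, s_3),\ \ldots,\ s_n' = \delta_u(s_{n-1}, s_n, \#)$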

Thus, in a computation tree of a uniform ACA each node has at most $|\delta|$
successors. Now a whole configuration is labeled universal (existential) if the
leftmost cell is in a universal (existential) state. An accepting subtree is a finite
subtree of the computation tree that includes all (one) of the successors of a
universal (existential) node. As usual all leaves have to be labeled with accepting
configurations.
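
For comparison with the nonuniform case, the global transition of a uniform ACA can be sketched as follows. As before this is only an illustration: configurations are tuples, local transitions are callables taking (left, own, right), and the names `uniform_successors`, `deltas` and `BORDER` are hypothetical.

```python
BORDER = "#"  # boundary state, assumed not to be an ordinary state


def uniform_successors(config, deltas):
    """Successor configurations of a uniform ACA.

    One local transition function is chosen globally and applied to all cells,
    so there is at most one successor per function in `deltas`.
    """
    n = len(config)
    padded = (BORDER,) + tuple(config) + (BORDER,)
    return {
        tuple(d(padded[i - 1], padded[i], padded[i + 1]) for i in range(1, n + 1))
        for d in deltas
    }
```

With a "keep" and a "flip" function on two binary cells this yields two successors, whereas independent choices per cell would yield four.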

Now we are combining both modes in order to define an intermediate model 
that serves in later proofs as a helpful tool since its time complexity can be 
reduced by a constant factor. We are considering a computation mode that is 
nonuniform for existential and uniform for universal states. It is called uniformly 
universal mode and the corresponding devices are denoted by UUACA and 
UUAOCA. 

For our purposes it is sufficient to consider UUAOCAs $M = (S, \delta, \#, A, F)$
which are alternation normalized as follows:

$A \subseteq S_e$ and $\forall\, \delta_i \in \delta :$

$(\forall\, s_1 \in S_e, s_2 \in S_e \cup \{\#\} : \delta_i(s_1, s_2) \in S_u$ and

$\forall\, s_1 \in S_u, s_2 \in S_u \cup \{\#\} : \delta_i(s_1, s_2) \in S_e)$

Thus, at every even time step all the cells are in existential and at every odd
time step all the cells are in universal states.
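
The normalization condition can be checked mechanically for small state sets. The sketch below assumes one-way local transitions given as callables taking (own, right) and requires the condition for every function in `deltas`; the function name and parameters are hypothetical.

```python
def is_alternation_normalized(S_e, S_u, inputs, deltas, border="#"):
    """Check that inputs are existential and that every local transition maps
    all-existential neighborhoods to universal states and vice versa."""
    if not set(inputs) <= set(S_e):
        return False
    for d in deltas:
        for s1 in S_e:
            for s2 in list(S_e) + [border]:
                if d(s1, s2) not in S_u:
                    return False
        for s1 in S_u:
            for s2 in list(S_u) + [border]:
                if d(s1, s2) not in S_e:
                    return False
    return True
```

A transition that swaps an existential state with a universal one passes the check, while the identity transition does not.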
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Lemma 6. Let t : IN — >■ IN, t{n) > n, be a mapping and k G Q,, k > 1, be a 
eonstant. For every alternation normalized UUAOCA M that aecepts a language 
L(M) with time complexity k-t there exists an alternation normalized lJUAOC A 
M' with time complexity t such that L(M) = L(M') and vice versa. 

One central point of the proof is that the number of successor configurations
of a universal configuration in uniformly universal mode is bounded by $|\delta|$. Its
details can be found in 0.

Corollary 7. Let $t : \mathbb{N} \to \mathbb{N}$, $t(n) \geq n$, be a mapping and $k \in \mathbb{Q}$, $k \geq 1$, be a
constant. For every alternation normalized UUACA $\mathcal{M}$ that accepts a language
$L(\mathcal{M})$ with time complexity $k \cdot t$ there exists an alternation normalized UUACA
$\mathcal{M}'$ with time complexity $t$ such that $L(\mathcal{M}) = L(\mathcal{M}')$ and vice versa.

Proof. The construction of Theorem 01 does not affect the status of the cells 
(i.e. whether they are existential or universal). Therefore, for a given alternation 
normalized UUACA there exists an equivalent alternation normalized UUAOCA 
with the same time complexity. The UUAOCA can be sped up and the resulting 
automaton, trivially, can be transformed into an alternation normalized UUACA 
again. □ 

4 Comparisons with (Non)Deterministic Cellular 
Automata 

It is a famous open problem whether or not the inclusion $L_{rt}(\mathrm{CA}) \subseteq L_{lt}(\mathrm{CA})$
is a proper one. Moreover, the seemingly easier problem of whether the inclusion
$L_{rt}(\mathrm{CA}) \subseteq L(\mathrm{CA})$ is proper is open, too. The same holds for nondeterministic
cellular automata: It is not known whether or not the inclusion
$L_{rt}(\mathrm{NCA}) \subseteq L(\mathrm{NCA})$ is strict.

Since $L(\mathrm{NCA})$ coincides with the context-sensitive languages and $L(\mathrm{CA})$ with
the deterministic context-sensitive languages, the properness of the inclusion
$L_{rt}(\mathrm{CA}) \subseteq L(\mathrm{NCA})$ is also open due to the open problems mentioned and the
famous LBA problem (i.e. in our terms whether $L(\mathrm{CA}) = L(\mathrm{NCA})$). The open problem
of whether $L_{rt}(\mathrm{CA}) \subseteq L_{rt}(\mathrm{NCA})$ is proper stresses the dilemma. Altogether, the
following inclusions follow for structural reasons but it is not known whether one
of them is strict.

$$L_{rt}(\mathrm{CA}) \subseteq L(\mathrm{OCA}) \subseteq L(\mathrm{CA}) \subseteq L(\mathrm{NCA}) \quad\text{and}\quad
L_{rt}(\mathrm{CA}) \subseteq L_{rt}(\mathrm{NCA}) \subseteq L(\mathrm{NCA}).$$

In the present section we compare real-time ACAs to deterministic and non- 
deterministic cellular automata. The next result shows that adding alternations 
to nondeterministic computations yields enormous speed-ups. 

Theorem 8. $L(\mathrm{NCA}) \subseteq L_{rt}(\mathrm{ACA})$

Proof. Let $\mathcal{M} = (S, \delta, \#, A, F)$ be an NCA. Since the number of cells is bounded
by the length of the input $n$ the number of configurations is bounded by $|S|^n$.
Therefore, we can assume that $\mathcal{M}$ has the exponential time complexity $|S|^n$.

The following construction is done for $|S| = 2$. A generalization is
straightforward. The first step is to define an alternation normalized UUACA
$\mathcal{M}' = (S', \delta', \#, A, F')$ which needs $3 \cdot n$ time steps to simulate $2^n$ steps of the
NCA $\mathcal{M}$. For the main part of $\mathcal{M}'$ (a deterministic track is added later) we set

$$S' := A \cup S^2 \cup S^3, \quad S'_e := A \cup S^2, \quad S'_u := S^3,$$

$$\delta' := \{\delta^1_{i,j,k} \mid 1 \leq i, j \leq |S|,\ 1 \leq k \leq |F|\} \cup \{\delta^e_i \mid 1 \leq i \leq |S|\} \cup \{\delta^u_1, \delta^u_2\}$$

where $\delta^1_{i,j,k}$, $\delta^e_i$, $\delta^u_1$ and $\delta^u_2$ will be defined in the following.

Let $n$ denote the length of the input, $\delta = \{\delta_1, \ldots, \delta_{|\delta|}\}$, $S = \{s_1, \ldots, s_{|S|}\}$ and
$F = \{f_1, \ldots, f_{|F|}\}$.

In its first time step $\mathcal{M}'$ saves its input configuration, guesses an accepting
configuration of $\mathcal{M}$ and another configuration of $\mathcal{M}$:

$$\forall\, p_1, p_2 \in A,\ p_3 \in A \cup \{\#\} : \quad
\delta^1_{i,j,k}(p_1, p_2, p_3) := (p_2, s_i, s_j)$$

$$\delta^1_{i,j,k}(\#, p_2, p_3) := (p_2, s_i, f_k), \quad 1 \leq i, j \leq |S|,\ 1 \leq k \leq |F|$$

The idea is to guess successively an accepting computation path $c_0, \ldots, c_{2^n}$
of $\mathcal{M}$. The configuration on the second track should be $c_{2^{n-1}}$ and the accepting
configuration on the third track should be $c_{2^n}$.

From now on at every universal step two offspring are spawned for each
configuration. One gets the configurations on the first and second track and the
other the configurations on the second and third track:

$$\forall\, (p_{1,1}, p_{1,2}, p_{1,3}),\ (p_{3,1}, p_{3,2}, p_{3,3}) \in S^3 \cup \{\#\},\ (p_{2,1}, p_{2,2}, p_{2,3}) \in S^3 :$$

$$\delta^u_1\bigl((p_{1,1}, p_{1,2}, p_{1,3}), (p_{2,1}, p_{2,2}, p_{2,3}), (p_{3,1}, p_{3,2}, p_{3,3})\bigr) := (p_{2,1}, p_{2,2})$$

$$\delta^u_2\bigl((p_{1,1}, p_{1,2}, p_{1,3}), (p_{2,1}, p_{2,2}, p_{2,3}), (p_{3,1}, p_{3,2}, p_{3,3})\bigr) := (p_{2,2}, p_{2,3})$$

Since $c'_1$ represents the configurations $c_0$, $c_{2^{n-1}}$ and $c_{2^n}$ (on its first, second
and third track) its two successors represent the configuration pairs $(c_0, c_{2^{n-1}})$
and $(c_{2^{n-1}}, c_{2^n})$. (The notation $(c_i, c_j)$ says that the first track contains $c_i$ and
the second one $c_j$.)

In every further existential step the configuration between the represented
configurations is guessed:

$$\forall\, (p_{1,1}, p_{1,2}),\ (p_{3,1}, p_{3,2}) \in S^2 \cup \{\#\},\ (p_{2,1}, p_{2,2}) \in S^2 :$$

$$\delta^e_i\bigl((p_{1,1}, p_{1,2}), (p_{2,1}, p_{2,2}), (p_{3,1}, p_{3,2})\bigr) := (p_{2,1}, s_i, p_{2,2}), \quad 1 \leq i \leq |S|$$

Thus, the two possible configurations of $\mathcal{M}'$ at time step 3 represent
the configuration triples $(c_0, c_{2^{n-2}}, c_{2 \cdot 2^{n-2}})$ and $(c_{2 \cdot 2^{n-2}}, c_{3 \cdot 2^{n-2}}, c_{4 \cdot 2^{n-2}})$. One
time step later we have the four pairs $(c_0, c_{2^{n-2}})$, $(c_{2^{n-2}}, c_{2 \cdot 2^{n-2}})$, $(c_{2 \cdot 2^{n-2}}, c_{3 \cdot 2^{n-2}})$
and $(c_{3 \cdot 2^{n-2}}, c_{4 \cdot 2^{n-2}})$.

Concluding inductively it is easy to see that at time $t = 2 \cdot m$, $1 \leq m \leq n$, there
exist $2^m$ configurations of $\mathcal{M}'$ representing the pairs $(c_0, c_{2^{n-m}})$, $(c_{2^{n-m}}, c_{2 \cdot 2^{n-m}})$,
$(c_{2 \cdot 2^{n-m}}, c_{3 \cdot 2^{n-m}})$, $\ldots$, $(c_{(2^m - 1) \cdot 2^{n-m}}, c_{2^m \cdot 2^{n-m}})$.

For $t = 2 \cdot n$ we obtain $(c_0, c_1), (c_1, c_2), \ldots, (c_{2^n - 1}, c_{2^n})$. Now $\mathcal{M}'$ can locally
check whether the second element of a pair is a valid successor of the first
elements of the cell and its neighbors according to the local transitions of $\mathcal{M}$. If
the check succeeds $\mathcal{M}'$ has guessed an accepting computation path of $\mathcal{M}$ and
accepts the input.

In order to perform the check each cell of $\mathcal{M}'$ has to be aware of the time step
$2 \cdot n$. For this purpose a deterministic FSSP algorithm HUE] is started on an
additional track which synchronizes the cells at time $2 \cdot n$. Altogether the result
of the check is available at time step $2 \cdot n + 1$ and needs another $n - 1$ time steps
to get into the leftmost cell. We conclude that $\mathcal{M}'$ has time complexity $3 \cdot n$. By
Lemma|S|the alternation normalized UUACA $\mathcal{M}'$ can be sped up to real-time.

It remains to show that $L_{rt}(\mathrm{UUACA}) \subseteq L_{rt}(\mathrm{ACA})$. The proof is a
straightforward adaptation of the proof that (nonuniform) ACAs are at least as
powerful as uniform ACAs nni. □
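The guessing scheme of the proof follows a divide-and-conquer pattern: an existential step guesses the configuration in the middle of a path, and a universal step checks both halves. The sketch below replays this pattern sequentially on an abstract transition system. It is not the cellular construction itself; the function names, the set `configs` and the toy successor relation `step` are assumptions made for the illustration.

```python
def reachable(start, goal, exponent, configs, step):
    """True iff goal is reachable from start in exactly 2**exponent steps.

    Mirrors the alternation pattern: `any` plays the role of the existential
    guess of the middle configuration, `and` the universal check of both halves.
    """
    if exponent == 0:
        # Pairs of consecutive configurations are checked locally.
        return goal in step(start)
    return any(
        reachable(start, middle, exponent - 1, configs, step)
        and reachable(middle, goal, exponent - 1, configs, step)
        for middle in configs
    )


# Hypothetical nondeterministic system on the configurations 0..7:
# in one step a configuration either stays or advances by one (mod 8).
configs = range(8)


def step(c):
    return {c, (c + 1) % 8}


four_steps_to_4 = reachable(0, 4, 2, configs, step)
four_steps_to_5 = reachable(0, 5, 2, configs, step)
```

The recursion depth equals the exponent, which corresponds to the linear number of alternation rounds needed to cover an exponentially long computation path.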



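Corollary 9. $L(\mathrm{NCA}) \subseteq L_{rt}(\mathrm{AOCA})$

Extending the previously mentioned chains of inclusions by the last result
we obtain

$$L_{rt}(\mathrm{CA}) \subseteq L(\mathrm{OCA}) \subseteq L(\mathrm{CA}) \subseteq L(\mathrm{NCA}) \subseteq L_{rt}(\mathrm{ACA}) \quad\text{and}\quad
L_{rt}(\mathrm{CA}) \subseteq L_{rt}(\mathrm{NCA}) \subseteq L(\mathrm{NCA}) \subseteq L_{rt}(\mathrm{ACA}).$$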

The next result shows that in both chains one of the inclusions is a proper
one. It states $L_{rt}(\mathrm{CA}) \subset L_{rt}(\mathrm{ACA})$. We prove the inclusion by the use of a
specific kind of deterministic cellular spaces as connecting pieces. A deterministic
cellular space (CS) works like a deterministic cellular automaton. The difference
is the unbounded number of cells. In cellular spaces there exists a so-called
quiescent state $q_0$ such that the local transition satisfies $\delta(q_0, q_0, q_0) = q_0$. At
time 0 all the cells from $\mathbb{Z}$ except the cells $1, \ldots, n$ which get the input are in
the quiescent state. Obviously, at every time step the number of nonquiescent
cells increases at most by 2.

An infinite hierarchy of language families has been shown: For example,
if $r \in \mathbb{Q}$, $r \geq 1$, and $\varepsilon \in \mathbb{Q}$, $\varepsilon > 0$, then $L_{n^r}(\mathrm{CS}) \subset L_{n^{r+\varepsilon}}(\mathrm{CS})$.

Especially, for $r = 1$ and $\varepsilon = 1$ it holds $L_{rt}(\mathrm{CS}) \subset L_{n^2}(\mathrm{CS})$.

Cellular spaces which are bounded to the left (and unbounded to the right)
are equivalent to the original model since both halflines can be simulated on
different tracks in parallel. Moreover, one obtains again an equivalent model if
the number of cells is bounded by the time complexity. Let $w = w_1 \cdots w_n$ be an
input word and $s : \mathbb{N} \to \mathbb{N}$, $s(n) \geq n$, be a mapping. The family of languages
acceptable by deterministic cellular automata with initial configuration

$$c_{0,w} : [1, \ldots, s(|w|)] \to S, \qquad
c_{0,w}(i) := \begin{cases} w_i & \text{if } 1 \leq i \leq |w| \\ q_0 & \text{if } |w| + 1 \leq i \leq s(|w|) \end{cases}$$

is denoted by $L_{t,s}(\mathrm{CA})$. It follows immediately $L_t(\mathrm{CS}) = L_{t,t}(\mathrm{CA})$ (here we
always assume $\delta(q_0, q_0, \#) = q_0$).



Theorem 10. $L_{rt}(\mathrm{CA}) \subset L_{rt}(\mathrm{ACA})$

Proof. From $L_t(\mathrm{CS}) = L_{t,t}(\mathrm{CA})$ for $t = \mathrm{id}$ we obtain $L_{rt}(\mathrm{CS}) = L_{n,n}(\mathrm{CA})$. The
latter family is based on simultaneously $n$-time bounded and $n$-space bounded
cellular automata, i.e. real-time (classical) cellular automata. Thus, $L_{rt}(\mathrm{CS}) =
L_{n,n}(\mathrm{CA}) = L_{rt}(\mathrm{CA})$. By the result in 0 for $r = 1$ and $\varepsilon = 1$ it follows $L_{rt}(\mathrm{CS}) \subset
L_{n^2}(\mathrm{CS})$ which is equivalent to $L_{rt}(\mathrm{CA}) \subset L_{n^2}(\mathrm{CS}) = L_{n^2,n^2}(\mathrm{CA})$. Now in order
to prove the theorem we have to show $L_{n^2,n^2}(\mathrm{CA}) \subseteq L_{rt}(\mathrm{ACA})$.

The following construction results in an alternation normalized UUACA $\mathcal{M}'$
which simulates a simultaneously $n^2$-time bounded and $n^2$-space bounded CA
$\mathcal{M}$. Note that a deterministic computation task can per se be regarded as
alternation normalized and meets the conditions of the uniformly universal
computation mode.

Let $c_0, \ldots, c_{n^2}$ denote the configurations of an accepting computation path
of $\mathcal{M}$ on input $w = w_1 \cdots w_n$. $\mathcal{M}'$ gets the input $c'_0(i) = c_0(i) = w_i$, $1 \leq i \leq n$,
and 'knows' $c_0(i)$, $n + 1 \leq i \leq n^2$, to be the quiescent state $q_0$.

The key idea for $\mathcal{M}'$ is to guess the states $c_{n^2-n}(1), \ldots, c_{n^2-n}(n)$ existentially
during the first time step and subsequently to spawn two offspring computations
universally. One of them is the deterministic task to simulate $\mathcal{M}$ on input
$c_{n^2-n}(1), \ldots, c_{n^2-n}(n)$ for $n$ time steps in order to check whether $\mathcal{M}$ would
accept. The second offspring has to verify whether the guess has been correct (i.e.
$\mathcal{M}$ produces a corresponding configuration at time step $n^2 - n$). Therefore, at
the third time step it guesses the states $c_{n^2-2n}(1), \ldots, c_{n^2-2n}(2 \cdot n)$ two times
on three tracks: on one track in the compressed form (i.e. every cell contains
two states), on another track the states $c_{n^2-2n}(1), \ldots, c_{n^2-2n}(n)$ and on a third
track the states $c_{n^2-2n}(n+1), \ldots, c_{n^2-2n}(2 \cdot n)$. (Whether or not the guess yields
two times the same sequence can be checked deterministically.) At the next time
step $\mathcal{M}'$ universally spawns three offspring: One of them is the deterministic
task to simulate $\mathcal{M}$ on $c_{n^2-2n}(1), \ldots, c_{n^2-2n}(2 \cdot n)$ for $n$ time steps to check
whether $\mathcal{M}$ would compute the previously guessed states $c_{n^2-n}(1), \ldots, c_{n^2-n}(n)$
and, thus, to verify the previous guess. The second and third offspring have
to verify whether the new guesses are correct. The second offspring guesses
$c_{n^2-3n}(1), \ldots, c_{n^2-3n}(2 \cdot n)$ and iterates the described procedure. The third task
has to guess the states $c_{n^2-3n}(1), \ldots, c_{n^2-3n}(3 \cdot n)$ two times on four tracks:
in the compressed form (i.e. three states per cell) and $c_{n^2-3n}(1), \ldots, c_{n^2-3n}(n)$
and $c_{n^2-3n}(n+1), \ldots, c_{n^2-3n}(2 \cdot n)$ and $c_{n^2-3n}(2n+1), \ldots, c_{n^2-3n}(3 \cdot n)$ on
separate tracks. Now a corresponding procedure is iterated. After the guessing
one offspring simulates $\mathcal{M}$ for $n$ time steps on $c_{n^2-3n}(1), \ldots, c_{n^2-3n}(3 \cdot n)$ and
another three offspring verify the guesses.

Concluding inductively at time $2 \cdot i$ there exist offspring computations for
the verification of $c_{n^2 - i \cdot n}(j \cdot n + 1), \ldots, c_{n^2 - i \cdot n}((j + 1) \cdot n)$ where $2 \leq i \leq n$ and
$0 \leq j \leq i - 1$.

For $i = n$ the sequences $c_0(j \cdot n + 1), \ldots, c_0((j + 1) \cdot n)$ have to be verified.
This can be done by checking whether the states match the initial input. For
this reason the cells have to be aware of the time step $2 \cdot n$, which can be achieved
by providing a deterministic FSSP algorithm on an additional track as has been
done in the previous proof. Moreover, the computations have to know to which
initial input symbols their sequences have to be compared. Those that verify the
sequences $c_{n^2 - i \cdot n}(1), \ldots, c_{n^2 - i \cdot n}(n)$ behave slightly differently. They spawn
three (instead of four) offspring at every universal step. Since exactly the
sequence $c_0(1), \ldots, c_0(n)$ has to be compared to the input $w_1 \cdots w_n$ whereas all
other sequences simply have to be compared to $q_0, \ldots, q_0$, the verification can
be done.

The FSSP fires at time $2 \cdot n$. Afterwards the (partial) simulation of $\mathcal{M}$ needs
another $n$ time steps. To collect the result of that simulation and to get it into
the leftmost cell needs at most $n$ further time steps. Altogether, $\mathcal{M}'$ has the
time complexity $4 \cdot n$. Following the last steps of the proof of Theorem 8 the
alternation normalized UUACA $\mathcal{M}'$ can be sped up to real-time and one can
conclude $L(\mathcal{M}') \in L_{rt}(\mathrm{ACA})$. □

Altogether in the previous construction the number of states of $\mathcal{M}'$ depends
linearly on the number of states of $\mathcal{M}$.

Another interpretation of the last theorem is the possibility to save time and
space simultaneously when adding alternation to a deterministic computation.
Moreover, now we know that at least one of the following inclusions is strict:

$$L_{rt}(\mathrm{CA}) \subseteq L_{rt}(\mathrm{NCA}) \subseteq L_{rt}(\mathrm{ACA})$$



Abstract. We show relations between new notions on cellular automata
based on topological and measure-theoretical concepts: almost everywhere
sensitivity to initial conditions for Besicovitch pseudo-distance,
damage spreading (which measures the information (or damage) propagation)
and the destruction of the initial configuration information.
Through natural examples, we illustrate the links between these formal
definitions and Wolfram’s empirical classification.



Introduction 

A radius-r unidimensional cellular automaton (CA) is an infinite succession of 
identical finite-state machines (indexed by Z) called cells. Each finite-state ma- 
chine is in a state and these states change simultaneously according to a local 
transition function: the following state of the machine is related to its own state 
as well as the states of its 2r neighbors. A configuration of an automaton is the 
function which associates to each cell a state. We can thus define a global tran- 
sition function from the set of all the configurations into itself which associates 
the following configuration after one step of computation. 

An evolution of a unidimensional cellular automaton is usually represented
by a space-time diagram: given an initial configuration, we represent in
$\mathbb{Z} \times \mathbb{N}$ the successive configurations of the cellular automaton.

Recently, a lot of articles proposed classifications of cellular automata ini 
Ej but the reference is still Wolfram’s empirical classification HS| which has 
resisted numerous attempts of formalization P^. The classification of Gilman jZ] 
is interesting because it is not a classification of CAs, but a classification of 
couples (CA, measure on its configuration set). This choice, not motivated in 
the paper, seems interesting because we will illustrate on an example that the 
intuitive Wolfram’s classification depends on a measure, that is a way to choose 
a random configuration. Actually, due to their local interactions, CAs are often 
used to simulate physics phenomena and many of them, for instance fluid flow, 
present a non chaotic or chaotic behavior depending on some parameters (here 
the fluid speed). 

Recently, two very different ways have been investigated to find a definition
of chaos for cellular automata that fits with our intuition. On the one hand,
many people started from the mathematical definitions of chaos for dynamical
systems and adapted them to cellular automata. With the usual product
topology on $\{0, 1\}^{\mathbb{Z}}$, the shift is necessarily chaotic, and for many possible
applications of CAs, like road traffic for example, we see that these definitions are
ill-suited. That is why Formenti introduced Besicovitch topology 11121 . For
this topology, the phase space is not locally compact, thus all the mathematical
results become wrong or at least have to be proved again. On the other hand,
starting from the physicist approach of chaos, that is high sensitivity to initial
conditions, Bagnoli et al. propose to measure chaos experimentally through
Lyapunov exponents, that is, roughly, to evaluate the damage spreading speed
when a single cell is modified.

In this article, we will formalize both approaches to study their relationships.
First, we will define the almost everywhere sensitivity to initial conditions for
Besicovitch topology and then we will partition the CAs according to whether the
average number of defects tends to zero, is bounded, or is unbounded. We will also
be interested in the definitions of B$\mu$-attracting sets and D$\mu$-attracting sets m-.
Let us notice that all the definitions depend on a measure.

We will begin with the definitions of cellular automata, Besicovitch topology
and Bernoulli measure. In a second section, we will formalize damage spreading,
almost everywhere sensitivity and $\mu$-attracting sets. We will then study the
relations that exist between the classes we defined. The last section discusses
the links with Wolfram’s classification.

The extended version with the proofs is to be found as a research report
available by FTP El-

1 Definitions 

1.1 Cellular Automata 

For simplicity, we will only consider unidimensional CAs in this paper. Howe- 
ver, all the concepts we introduce are topological and it seems that there is no 
problem to extend them to higher dimensional CAs. 

Definition 1. A radius-r unidimensional cellular automaton is a couple
$(Q, \delta)$ where $Q$ is a finite set of states and $\delta : Q^{2r+1} \to Q$ is a transition
function. A configuration $c \in Q^{\mathbb{Z}}$ of $(Q, \delta)$ is a function from $\mathbb{Z}$ into $Q$ and its
global transition function $G_\delta : Q^{\mathbb{Z}} \to Q^{\mathbb{Z}}$ is such that
$(G_\delta(c))(i) = \delta(c(i-r), \ldots, c(i), \ldots, c(i+r))$.
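A minimal sketch of the global transition function may help to fix ideas. Since an infinite configuration cannot be stored, the sketch assumes a finite configuration with periodic boundary conditions, which is a simplification and not part of Definition 1. The names `global_step`, `space_time_diagram` and `shift_left` are hypothetical.

```python
def global_step(config, delta, r):
    """One application of the global transition to a cyclic finite configuration.

    `delta` receives the tuple (c(i-r), ..., c(i), ..., c(i+r)).
    """
    n = len(config)
    return [delta(tuple(config[(i + k) % n] for k in range(-r, r + 1)))
            for i in range(n)]


def space_time_diagram(config, delta, r, steps):
    """Successive configurations, row t being the configuration after t steps."""
    rows = [list(config)]
    for _ in range(steps):
        rows.append(global_step(rows[-1], delta, r))
    return rows


# Hypothetical radius-1 rule: every cell copies its right neighbor.
def shift_left(neighborhood):
    return neighborhood[2]


diagram = space_time_diagram([1, 0, 0, 0], shift_left, r=1, steps=2)
```

Each row of `diagram` corresponds to one line of a space-time diagram restricted to the finite window.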

Notation 1 Let us define the map

$$x = x_{1-r} \ldots x_{m+r} \;\longmapsto\; y = y_1 \ldots y_m$$

such that for all $i \in \{1, \ldots, m\}$, $y_i = \delta(x_{i-r}, \ldots, x_i, \ldots, x_{i+r})$.

Definition 2. An Elementary Cellular Automaton (ECA) is a radius-1
two states (usually 0 and 1) unidimensional cellular automaton.

For ECAs, we will use Wolfram’s notation: they are represented by an integer
between 0 and 255 such that the transition function $\delta_i$ of the CA number $i$ whose
writing in base 2 is $i = a_7 a_6 a_5 a_4 a_3 a_2 a_1 a_0$ satisfies:

$$\begin{array}{ll}
\delta_i(0,0,0) = a_0 & \delta_i(1,0,0) = a_4 \\
\delta_i(0,0,1) = a_1 & \delta_i(1,0,1) = a_5 \\
\delta_i(0,1,0) = a_2 & \delta_i(1,1,0) = a_6 \\
\delta_i(0,1,1) = a_3 & \delta_i(1,1,1) = a_7
\end{array}$$

Let us remark that CAs with different numbers may have the same behavior by
switching the states 0 and 1, for instance $184 = 10111000_2$ and $226 = 11100010_2$.
If $r$ is a rule number, we will denote $\overline{r}$ the rule after exchanging the states and
$\widetilde{r}$ the rule which has a symmetric behavior (see 0 for more details).

We will speak about the cellular automaton $120 = (\{0, 1\}, \delta_{120})$ or
equivalently of the rule 120.
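Wolfram's numbering and the two symmetries mentioned above can be checked mechanically. The following sketch is an illustration only; the function names `rule_table`, `rule_number`, `exchange_states` and `mirror` are hypothetical helpers, with bit $4a + 2b + c$ of the rule number giving the image of the neighborhood $(a, b, c)$.

```python
def rule_table(number):
    """Transition table of the ECA `number`: (a, b, c) -> bit 4a + 2b + c."""
    return {(a, b, c): (number >> (4 * a + 2 * b + c)) & 1
            for a in (0, 1) for b in (0, 1) for c in (0, 1)}


def rule_number(table):
    """Inverse of rule_table: rebuild the integer from a transition table."""
    return sum(bit << (4 * a + 2 * b + c) for (a, b, c), bit in table.items())


def exchange_states(number):
    """Rule obtained after exchanging the states 0 and 1."""
    t = rule_table(number)
    return rule_number({(a, b, c): 1 - t[(1 - a, 1 - b, 1 - c)]
                        for (a, b, c) in t})


def mirror(number):
    """Rule with the left-right symmetric behavior."""
    t = rule_table(number)
    return rule_number({(a, b, c): t[(c, b, a)] for (a, b, c) in t})


exchanged_184 = exchange_states(184)
mirrored_184 = mirror(184)
```

For rule 184 both operations lead to rule 226, which matches the example given in the text.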



In the general definition of additive CAs due to Wolfram, an additive CA
is a CA that satisfies the superposition principle
($\delta(x \oplus x', y \oplus y', z \oplus z') = \delta(x, y, z) \oplus \delta(x', y', z')$).
These CAs are very useful for providing examples because their behavior obeys
algebraic rules suited to a formal study while their space-time diagrams appear
complicated. We will use here, like in daini, a more restrictive definition:

Definition 3. We will call additive CA a unidimensional CA whose state set is
$\mathbb{Z}/n\mathbb{Z}$ and whose transition function is of the form:

$$\delta(x_{-1}, x_0, x_1) = x_0 + x_1 \pmod{n}$$



1.2 Besicovitch Topology 

The most natural topology on CA configuration sets is the product topology. The
problem is that this topology emphasizes what is happening close to the origin
while in many applications of CAs all the cells have the same importance. Thus,
the mathematical notions of chaos adapted to CAs for the product topology are
not satisfactory: the shift is necessarily chaotic, which is not suited to
car traffic simulation for example. To propose more satisfying definitions of chaos,
Formenti introduced the Besicovitch pseudo-distance, which induces a shift invariant
topology on the quotient space:

Definition 4. The Besicovitch pseudo-metric on $Q^{\mathbb{Z}}$ is given by

$$d(c, c') = \limsup_{l \to +\infty} \frac{\#\{i \in [-l, l] \mid c_i \neq c'_i\}}{2l + 1}$$

where $\#$ denotes the cardinality.

Property 1. $Q^{\mathbb{Z}}$ quotiented by the relation $x \sim y \iff d(x, y) = 0$ with Besicovitch
topology is metric, path-wise connected, infinite dimensional, complete,
neither separable nor locally compact. Furthermore, $x \sim y \implies G_\delta(x) \sim
G_\delta(y)$ and the transition function of a CA is a continuous map from $Q^{\mathbb{Z}}/\!\sim$ into
itself.
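The pseudo-metric of Definition 4 is a limit superior and cannot be computed exactly from finitely many cells, but the quotient inside the limit can be evaluated on a finite window. The sketch below does only that, as an illustration; representing configurations as Python functions from integers to states and the name `window_difference_density` are assumptions of the example.

```python
def window_difference_density(c1, c2, l):
    """Fraction of the cells i in [-l, l] on which the configurations differ."""
    differing = sum(1 for i in range(-l, l + 1) if c1(i) != c2(i))
    return differing / (2 * l + 1)


def zero(i):
    return 0


def one_defect(i):          # differs from `zero` only at the origin
    return 1 if i == 0 else 0


def alternating(i):         # ...0101... pattern
    return i % 2


def shifted_alternating(i):
    return alternating(i + 1)


near_zero = window_difference_density(zero, one_defect, 1000)
near_half = window_difference_density(zero, alternating, 1000)
```

A single differing cell contributes a density that vanishes as the window grows, so `zero` and `one_defect` are identified in the quotient space, whereas the product topology would distinguish them.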



Remark 1. Actually, the results of this paper are not specific to Besicovitch
topology, but are true for a wide class of topologies including the Weyl one. A general
study of Besicovitch like topologies has been done in [2] and an interesting
question would be to determine those that behave like the Besicovitch one and what
happens for the other ones. The only reason we point this out here is that there
are many ways to extend Besicovitch in higher dimensional grids: the extension
of the Besicovitch pseudo-metric on $Q^{\mathbb{Z}^n}$ is

$$d(c, c') = \limsup_{l \to +\infty} \frac{\#\{i \in B(0, l) \subset \mathbb{Z}^n \mid c_i \neq c'_i\}}{\# B(0, l)}$$

where $B(0, l)$ is a ball centered at the origin and of radius $l$ in $\mathbb{Z}^n$ for an
arbitrarily chosen distance on $\mathbb{Z}^n$, for instance $d_1(a_1, \ldots, a_n) = |a_1| + \ldots + |a_n|$ or
$d_2(a_1, \ldots, a_n) = (a_1^2 + \ldots + a_n^2)^{1/2}$. Of course different distances give different
topologies but all of them are equivalent for our purpose because they differ on a
null measure set. This is the only difficulty to extend the unidimensional concept
to higher dimensional CAs because the definitions of measures and ergodicity
given in the following section exist for any dimensional space.



1.3 Measure on the Configuration Set 

Notation 2 Let $Q$ be a finite alphabet with at least two letters. $Q^+ = \bigcup_{n \geq 1} Q^n$
is the set of finite words on $Q$. The coordinate $x(i)$ of a point $x \in Q^{\mathbb{Z}}$ will
also be denoted $x_i$ and $x_{[j,k]} = x_j \ldots x_k \in Q^{k-j+1}$ is the segment of $x$ between
indices $j$ and $k$. The cylinder of $u \in Q^n$ at position $k \in \mathbb{Z}$ is the set
$[u]_k = \{x \in Q^{\mathbb{Z}} \mid x_{[k,k+n-1]} = u\}$.

Let $\sigma$ be the shift toward the left: $\sigma(c)_i = c_{i+1}$ (i.e. the rule number 170).

A Borel probability measure is a nonnegative function $\mu$ defined on Borel
sets. It is given by its values on cylinders, satisfies $\mu(Q^{\mathbb{Z}}) = 1$, and for every
$u \in Q^+$, $k \in \mathbb{Z}$,

$$\sum_{q \in Q} \mu[uq]_k = \mu[u]_k \quad\text{and}\quad \sum_{q \in Q} \mu[qu]_k = \mu[u]_{k+1}$$

Definition 5. A measure $\mu$ is $\sigma$-invariant if $\mu[u]_k$ does not depend on $k$ (and
will thus be denoted $\mu[u]$).

Definition 6. A $\sigma$-invariant measure is $\sigma$-ergodic if for every invariant
measurable set $Y$ ($\sigma(Y) = Y$), either $\mu(Y) = 0$ or $\mu(Y) = 1$.

Bernoulli measures are the simplest ones, which is why we will use
them in all our examples, although the definitions and probably most of the theorems
remain true for other $\sigma$-ergodic measures, for instance Markov measures (with
correlations over a finite number of cells) or measures such that the correlation
between two states decreases exponentially with their distance. Obviously, to
study specific rules, like number preserving rules, Bernoulli measures are less
interesting and we may want to consider other $\sigma$-ergodic measures. But, in
practice, these other measures will often be Bernoulli measures after a grouping
operation.

Definition 7. A Bernoulli measure is defined by a strictly positive probability
vector $(p_q)_{q \in Q}$ with $\sum_{q \in Q} p_q = 1$: if $u = u_0 \ldots u_{n-1} \in Q^n$, then
$\mu[u_0 \ldots u_{n-1}] = p_{u_0} \cdots p_{u_{n-1}}$.

We will use the following classical result: the Bernoulli measures are
$\sigma$-ergodic.

For 2-states CAs, the Bernoulli measures will be denoted $\mu_p$ where $p = p_1 =
1 - p_0$ is the probability for a state to be 1.
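As a small illustration of Definition 7, the measure of a cylinder under a Bernoulli measure is the product of the letter probabilities, independently of the position. The sketch below uses a hypothetical helper `cylinder_measure` and an arbitrary probability vector chosen for the example; it also checks numerically the consistency condition $\sum_{q \in Q} \mu[uq] = \mu[u]$ stated above.

```python
from itertools import product
from math import isclose, prod


def cylinder_measure(word, probabilities):
    """Bernoulli measure of the cylinder [word]: product of letter probabilities."""
    return prod(probabilities[letter] for letter in word)


# Hypothetical probability vector on Q = {0, 1}, i.e. the measure mu_p with p = 0.3.
probabilities = {0: 0.7, 1: 0.3}

word = (1, 0, 1)
measure_of_word = cylinder_measure(word, probabilities)

# Consistency: extending the word by every possible letter and summing
# gives back the measure of the word itself.
extended_sum = sum(cylinder_measure(word + (q,), probabilities)
                   for q in probabilities)

# All cylinders of a given length partition the configuration set.
total_mass = sum(cylinder_measure(w, probabilities)
                 for w in product(probabilities, repeat=3))
```

Because the value does not depend on the position of the cylinder, such a measure is $\sigma$-invariant in the sense of Definition 5.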




[Figure 1: four space-time diagrams, for p = 0.2, p = 0.4, p = 0.5 and p = 0.7]

Fig. 1. The CA T is a very simple traffic model based on the rule 184 but with two
different models of cars. The system seems “chaotic” when the density p of cars is
greater than or equal to 0.5 because of the traffic jams, but not “chaotic” otherwise.
Below the space-time diagrams (time goes toward the top), a grey level shows the
space-time distribution of the average number of alterations induced by the
modification of the middle cell.

In Figure 1 we see a very simple example of a CA whose behavior changes
depending on the density of cars on the road. Saying that this CA is chaotic
or not does not make sense since it depends on its use: whether it is
used for traffic jam or for fluid traffic simulation. Its average behavior makes
no sense either as long as we do not explain what a random configuration is, that is
which measure we take on its configuration set. If we assume that the cars are
initially distributed uniformly and that we have the same number of red and blue
cars, we will consider the Bernoulli measures $\mu^*_p$ such that the probability to find
a blue car in a cell is $p/2$ and equal to the probability to find a red car while the
probability that there is no car is $1 - p$. Now, it is possible to say (see below) that
this CA is $\mu^*_p$-almost everywhere sensitive to initial conditions when $p \geq 1/2$ while
it is $\mu^*_p$-almost never sensitive to initial conditions otherwise. If it is important to
take into account the fact that a lot of people take their cars at the same time to go
to work, other measures make it possible to model a non uniform distribution.



2 Some Classification Tools on CAs 



2.1 Damage Spreading 

Inspired by Lyapunov exponents P, we will define the damage spreading of a
CA via a measure. The main difference is that we count the “effective” damages
induced by a single cell modification. For instance, if the cell modification leads
to two alterations after $t - 1$ steps, and each of these alterations alone would change
one state at time $t$ but their joint action leaves this state unchanged,
then rather than counting 2 modifications as in the Lyapunov exponents, we
count 0 modifications because the state did not change. It appears that the rule
210 (see figure 0) has a Lyapunov exponent higher than 1 (thus the number of
modifications is exponential if we may count a cell many times) while its damage
spreading (the average number of different cells) is bounded. Let us now define
this formally:



Definition 8. Let $\mu$ be a $\sigma$-ergodic measure on the set of configurations, let $\mathcal{A}$
be a CA and $C_{\mathcal{A}}$ its configuration set. If $c \in C_{\mathcal{A}}$, we will define

$$c_{p \leftarrow s} : \mathbb{Z} \to Q_{\mathcal{A}}, \qquad
x \mapsto \begin{cases} c(x) & \text{if } x \neq p \\ s & \text{else} \end{cases}$$

that is the configuration whose state of the cell $p$ is changed to $s$. Let us now define
the dependence coefficients $\alpha_{\mu,\mathcal{A},t,p}$ which indicate the probability (according to
$\mu$) that the state of the cell 0 after $t$ computation steps changes when we change
the state at position $p$ to $q$ with probability $\pi_q$. Formally:

$$\alpha_{\mu,\mathcal{A},t,p} = \sum_{q \in Q_{\mathcal{A}}} \pi_q \, \mu\bigl(\{c \mid (G_{\mathcal{A}}^{t}(c))_0 \neq (G_{\mathcal{A}}^{t}(c_{p \leftarrow q}))_0\}\bigr)$$

Remark 2.
- When no confusion about the measure and the cellular automaton is possible,
  we will simply denote the dependence coefficients $\alpha_{t,p}$.
- If $r$ is the radius of the CA, $|p| > r \times t \implies \alpha_{t,p} = 0$ because the state of
  the cell 0 after $t$ computation steps is independent of the cell $p$ of the initial
  configuration.

Definition 9. The damage spreading of a cellular automaton $\mathcal{A}$ according to a
measure $\mu$ is the infinite sequence of positive real numbers

$$\Bigl(\sum_{p \in \mathbb{Z}} \alpha_{\mu,\mathcal{A},t,p}\Bigr)_{t \in \mathbb{N}}$$

This sequence is well defined for all $t$ thanks to the previous remark: the
$\alpha_{t,p}$ are almost all equal to zero.
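For elementary CAs and Bernoulli measures, the dependence coefficients can be computed exactly for small $t$ by enumerating the $2t + 1$ cells that can influence the cell 0. The sketch below is an illustration under two assumptions that are not stated in the text: the weight $\pi_q$ is taken to be the Bernoulli probability of the state $q$, and the function names are hypothetical. The rule 102 used in the example is the additive rule $x_0 + x_1 \pmod 2$ of Definition 3 in Wolfram's numbering.

```python
from itertools import product


def rule_table(number):
    """Transition table of the ECA `number`: (a, b, c) -> bit 4a + 2b + c."""
    return {(a, b, c): (number >> (4 * a + 2 * b + c)) & 1
            for a in (0, 1) for b in (0, 1) for c in (0, 1)}


def cell_zero_after(window, table, t):
    """State of the cell 0 after t steps, given the cells -t..t at time 0."""
    cells = list(window)
    for _ in range(t):
        cells = [table[(cells[i - 1], cells[i], cells[i + 1])]
                 for i in range(1, len(cells) - 1)]
    return cells[0]


def dependence_coefficient(number, t, p, prob_one):
    """alpha_{t,p} for the Bernoulli measure mu_p (assumes pi_q = p_q)."""
    if abs(p) > t:
        return 0.0  # outside the light cone of the cell 0 (radius 1)
    table = rule_table(number)
    weight = {0: 1.0 - prob_one, 1: prob_one}
    total = 0.0
    for window in product((0, 1), repeat=2 * t + 1):
        mu = 1.0
        for state in window:
            mu *= weight[state]
        before = cell_zero_after(window, table, t)
        for q in (0, 1):
            changed = list(window)
            changed[p + t] = q  # the cell p sits at index p + t of the window
            if cell_zero_after(changed, table, t) != before:
                total += weight[q] * mu
    return total


def damage_spreading(number, t, prob_one):
    """Term number t of the damage spreading sequence."""
    return sum(dependence_coefficient(number, t, p, prob_one)
               for p in range(-t, t + 1))


additive_t3 = damage_spreading(102, 3, 0.5)
additive_t4 = damage_spreading(102, 4, 0.5)
```

The enumeration grows as $2^{2t+1}$, so this approach is only meant for small $t$; larger times would call for sampling instead.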




Fig. 2. The rule 128 belongs to $[\mathcal{D} \to 0]$, 109 to $[\mathcal{D} \to a]$ and 30 to $[\mathcal{D} \to +\infty]$. The
bottom diagrams represent with grey level the probability of each cell to be affected
by the modification of the middle cell.

This notion allows us to define the class $[\mathcal{D} \to 0]$ of CAs whose damage
spreading tends to zero, the class $[\mathcal{D} \to +\infty]$ of CAs whose damage spreading is not
bounded (its limit sup is $+\infty$) and the class $[\mathcal{D} \to a]$ when the damage
spreading limit sup is a finite non zero value. Figure 2 shows an
example in each class. Obviously, these 3 classes define a partition of the set of
CAs.

Theorem 1. The additive CAs have non bounded damage spreading, they are
in $[\mathcal{D} \to +\infty]$ (see fl ij/ for the proof).

2.2 $\mu$-Almost Everywhere Sensitivity to Initial Conditions

Let us recall the classical definition of sensitivity to initial conditions:

Definition 10. A CA is sensitive to initial conditions for a pseudo-distance $d$
if there exists a constant $M > 0$ such that for all $\varepsilon > 0$ and for all configurations
$c$, there exists a configuration $c'$ with $d(c, c') < \varepsilon$ and an integer $n$ such that
$d(G_{\mathcal{A}}^{n}(c), G_{\mathcal{A}}^{n}(c')) > M$.

The main reason for the introduction of $\mu$-almost everywhere sensitivity is the
study of some particular cases. On the one hand, the rule 120 appears sensitive
to initial conditions but it seems that there exist some very artificial
configurations that stop all the information transfer so that actually the rule 120 is
not sensitive. On the other hand, the rule 210 does not appear to be sensitive,
but is not equicontinuous (i.e. is sensitive to initial conditions on a subset of
its configuration set) because of the configurations $0^*$ (see figure 0). Thus, in
the same class, the elements of which are neither sensitive nor equicontinuous,
we have two rules with very different behaviors. The idea is to say that 120 is
almost everywhere sensitive, while 210 is almost never sensitive.

To define the almost everywhere sensitivity to initial conditions, we could just
replace “for all configurations c” by “for $\mu$-almost all configurations c” in the
sensitivity definition. We will give a more restrictive (because of the first point
of the next remark) definition so that a CA that is not $\mu$-almost everywhere
sensitive is “$\mu$-almost never” sensitive to initial conditions (see the third point
of the next remark). Furthermore, because of the kind of proof we want to do,
it is not more difficult to prove the $\mu$-almost everywhere sensitivity for this
definition.



[Figure 3; panel label: $\mu$-random configuration]

Fig. 3. The configuration $c' = c\,\overline{e} + c''\,e$ is the configuration whose state at a given
position is equal to the corresponding state of $c$ when the corresponding state of $e$ is
equal to 0 and to the corresponding state of $c''$ else. Let us remark that, due to the
law of large numbers, with probability 1, $d(c, c') = \varepsilon\, d(c, c'') < \varepsilon$.

Definition 11. A CA is $\mu$-almost everywhere sensitive to initial conditions (for
Besicovitch pseudo-distance) if there exists $M > 0$ such that for all $\varepsilon_0 > 0$, there
exists $\varepsilon < \varepsilon_0$ such that if $c$ and $c''$ are two $\mu$-random configurations, if $e$ is a
$\mu_\varepsilon$-random configuration and if $c' = c\,\overline{e} + c''\,e$ is the configuration whose state at
a given position is equal to the corresponding state of $c$ when the corresponding
state of $e$ is equal to 0 and to the corresponding state of $c''$ else (see figure 3), then
with probability 1 (for $\mu \times \mu \times \mu_\varepsilon$) there exists $n$ such that
$d(G_{\mathcal{A}}^{n}(c), G_{\mathcal{A}}^{n}(c')) > M$.

Remark 3.
- This definition implies that there exists $M$ such that for $\mu$-almost
  all configurations $c$ and for all $\varepsilon > 0$, there exist $c'$ and $n$ with $d(c, c') < \varepsilon$
  and $d(G_{\mathcal{A}}^{n}(c), G_{\mathcal{A}}^{n}(c')) > M$;
- With the product topology, the previous result would imply the sensitivity
  to initial conditions, the point being that Bernoulli measures are of full support
  (i.e. the open sets have a non null measure), but it is not the case on
  $Q^{\mathbb{Z}}$ with Besicovitch topology;
- The set of configurations 3-uplets $(c, c'', e)$ such that if $c' = c\,\overline{e} + c''\,e$ there
  exists $n$ so that $d(G_{\mathcal{A}}^{n}(c), G_{\mathcal{A}}^{n}(c')) > M$ is obviously shift invariant on
  $(Q \times Q \times \{0, 1\})^{\mathbb{Z}}$. As $\mu \times \mu \times \mu_\varepsilon$ is $\sigma$-ergodic, the set measure is either 1 or 0.
  So a CA is either $\mu$-almost everywhere sensitive to initial conditions or
  “$\mu$-almost never sensitive to initial conditions”: for any $\eta$ there exists $\varepsilon$ such that
  if we build $c$, $c'$ as usual, for any $n$, $d(G_{\mathcal{A}}^{n}(c), G_{\mathcal{A}}^{n}(c')) < \eta$,
  $\mu \times \mu \times \mu_\varepsilon$-almost everywhere.

The /i-almost everywhere sensitivity to initial conditions makes sense because 
we saw that some CAs are not (obviously the rule 0 is not) and we will prove 
that the additive CAs are: 

Theorem 2. The additive CAs are fi-almost everywhere sensitive to initial eon- 
ditions for any non trivial Bernoulli measure fi (see m for the proof). 



$\mu$-attracting sets. $\mu$-attracting sets have been defined in [?]. In that article, P. Kůrka and A. Maass study the links between attracting and $\mu$-attracting sets for different topologies.

Definition 12. A sub-shift is any subset $\Sigma \subseteq Q^{\mathbb{Z}}$ which is $\sigma$-invariant and closed in the product topology. The language $L(\Sigma)$ of a sub-shift $\Sigma \subseteq Q^{\mathbb{Z}}$ is the set of factors of $\Sigma$. A sub-shift is of finite type (SFT) if there exists a positive integer $p$, called its order, such that for all $c \in Q^{\mathbb{Z}}$,

\[ c \in \Sigma \iff \forall i \in \mathbb{Z},\ c_{[i,\, i+p-1]} \in L(\Sigma). \]



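Definition 13. For a SFT $\Sigma \subseteq Q^{\mathbb{Z}}$ of order $p$ and $x \in Q^{\mathbb{Z}}$, define the density of $\Sigma$-defects in $x$ by

\[ d_D(x, \Sigma) = \limsup_{l \to \infty} \frac{\#\{\, i \in [-l, l] \mid x_{[i,\, i+p-1]} \notin L(\Sigma) \,\}}{2l+1}. \]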



Notation 3. When $d$ is a pseudo-distance, we can naturally define the pseudo-distance from an element $x$ to a set as the infimum of the pseudo-distances between $x$ and the elements of the set:

\[ d(x, \Sigma) = \inf_{y \in \Sigma} d(x, y). \]

Let us notice that $d_D(x, \Sigma)$ is not associated to any pseudo-distance because, in general, $d_D(x, \{y\}) \ne d_D(y, \{x\})$.
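
As a rough illustration of Definition 13, the defect density can be estimated on a finite window of a configuration. The sketch below is not taken from the paper: the function name `defect_density`, the encoding of the language as a set of allowed length-$p$ factors, and the use of a finite window in place of the lim sup are all assumptions made for the example.

```python
def defect_density(x, allowed, p):
    """Fraction of length-p factors of the finite word x that are not allowed.

    x       : list of states (a finite window of a configuration)
    allowed : set of tuples, the length-p words of the language L(Sigma)
    p       : order of the sub-shift of finite type
    """
    positions = range(len(x) - p + 1)
    defects = sum(1 for i in positions if tuple(x[i:i + p]) not in allowed)
    return defects / len(positions)


# Sub-shift {(01)*, (10)*}: an SFT of order 2 whose only factors are 01 and 10.
allowed = {(0, 1), (1, 0)}
window = [0, 1, 0, 1, 1, 0, 1, 0]  # contains one defect, the factor 11
print(defect_density(window, allowed, 2))
```

On a longer window the value approximates the lim sup only if the frequencies have stabilised, which is an assumption about the configuration and not something the code checks.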

Definition 14. Let $\mu$ be a $\sigma$-ergodic measure. A sub-shift $\Sigma$ is $B\mu$-attracting (resp. $D\mu$-attracting) if

\[ \mu\left(\{\, c \in Q^{\mathbb{Z}} \mid \lim_{n \to +\infty} d(G_A^n(c), \Sigma) = 0 \,\}\right) = 1 \]

when $d = d\ (= d_B)$, $d_D$ respectively.



Remark 4.
— As the set $\{\, c \in Q^{\mathbb{Z}} \mid \lim_{n \to +\infty} d(G_A^n(c), \Sigma) = 0 \,\}$ is shift invariant, its measure is either 0 or 1.

— From a $B\mu$-attracting sub-shift, we can easily extract a minimal $B\mu$-attracting sub-shift by considering the configurations $c'$ of $\Sigma$ such that $\mu(\{\, c \in Q^{\mathbb{Z}} \mid \liminf d_B(G_A^n(c), c') = 0 \,\}) = 1$. When $\mu$ is a Bernoulli measure, this implies that $c'$ is uniform (i.e. of the form $q^*$ for $q \in Q$).

— Furthermore, if a sub-shift is $B\mu$-attracting, then it is $D\mu$-attracting.




Fig. 4. $\{B^*\}$ is a $B\mu$-attracting (thus $D\mu$-attracting) set of the CA S, which has 3 states: one going to the right, one going to the left, and a third one, B, the blank state, into which the other states may move. When the two moving states meet, both are annihilated. When $\mu$ is a measure such that the states going to the left and to the right have the same probability of presence in the initial configuration, the uniform configuration composed of the blank state is a $B\mu$-attracting set. As S is a sub-automaton of $184^2$, where $184^2$ is the CA 184 whose states are grouped two by two (see [12]), we have the same result on $184^2$ for the measure such that (1,0) has probability 0 and the probabilities of (1,1) and (0,0) are equal. This measure on $(\{0,1\} \times \{0,1\})^{\mathbb{Z}}$ corresponds to a non shift-invariant measure on $\{0,1\}^{\mathbb{Z}}$, and actually 184 has no $B\mu$-attracting set when $\mu$ is a non trivial shift-invariant measure. The point is that for $\mu_{1/2}$ (so that particles going to the left and to the right have the same probability), the sub-shift $\{(01)^*, (10)^*\}$ is $D\mu$-attracting, but asymptotically the configurations tend to be at pseudo-distance 1/2 from both configurations. The rule 18 seems to be another example of a CA with $D\mu$-attracting sets but no $B\mu$-attracting set.



The definition of $B\mu$-attracting sets is very natural: a set is $B\mu$-attracting if, from almost all configurations (w.r.t. $\mu$), the (Besicovitch) pseudo-distance between the successive configurations and the sub-shift tends to 0. We saw in the remark above that in this case almost all evolutions tend to uniform configurations when $\mu$ is a Bernoulli measure. The rule 128 (see figure 2), or the CA S and $184^2$ for some measure $\mu$ (see figure 4), have a $B\mu$-attracting set. The definition of $D\mu$-attracting sets is more topological because it only depends on the language $L(\Sigma)$ and does not take into account how many times a pattern (a word of $L(\Sigma)$) appears. Thus, $D\mu$-attracting sets are more powerful and make it possible to point out homogenization processes toward periodic configurations, and not only toward uniform configurations. For instance, as proved in [?], the sub-shift $\{(01)^*, (10)^*\}$ is $D\mu_{1/2}$-attracting for the rule 184. As noticed in [?], the rule 18 seems to be an example of a CA with a $D\mu$-attracting set which is not of finite type:

\[ \{\, c \in \{0,1\}^{\mathbb{Z}} \mid \forall i,\ c(2i) = 0 \,\} \cup \{\, c \in \{0,1\}^{\mathbb{Z}} \mid \forall i,\ c(2i+1) = 0 \,\}. \]

In the following, we will only use $D\mu$-attracting sets to point out homogenization to periodic configurations. In this case, we see that all the information of the initial configuration is erased: formally, the metric entropy of the successive configurations tends to zero.

Definition 15. Let $(A, \delta)$ be a CA and $\mu$ a $\sigma$-ergodic measure. The metric entropy of its configuration after $t$ computation steps is defined as follows:

\[ S_\mu^t(A) = \lim_{n \to +\infty} -\frac{1}{n} \sum_{u \in Q^n} p_u \log(p_u) \]

with the usual convention $0 \times \log(0) = 0$ and where $p_u$ is the probability of occurrence of the pattern $u$ in the configuration $c$:

\[ p_u = G_A^t(\mu)([u]_0) \]

where the notation $f(\mu)$ represents, as usual, the measure defined by $f(\mu)(X) = \mu(f^{-1}(X))$. In mathematical terminology, $S_\mu^t(A)$ is the metric entropy of $\sigma$ for the measure $G_A^t(\mu)$.






Definition 16. The class of CAs that erase all the information of their initial configuration will be denoted $[S_\mu \to 0]$ and formally defined as follows: a CA is in $[S_\mu \to 0]$ if and only if

\[ S_\mu^t(A) \xrightarrow[t \to +\infty]{} 0. \]

Theorem 3. If a CA has a $D\mu$-attracting set of null topological entropy, then it is in $[S_\mu \to 0]$ (see [?] for the proof).



Definition 17. The topological entropy of a sub-shift $\Sigma$ is

\[ S_\Sigma(\sigma) = \lim_{n \to +\infty} \frac{\ln\left(\#\{\, u \in Q^n \mid \exists c \in \Sigma,\ c_{[0,\,|u|-1]} = u \,\}\right)}{n}. \]

3 Relations between Damage Spreading, $\mu$-a.e. Sensitiveness and the Existence of a $B\mu$-Attracting Set

In this section, we always assume that $\mu$ is a Bernoulli measure.

Theorem 4. If the $\mu$-damage spreading of a CA A tends to 0, then there exists a set of uniform configurations which is $B\mu$-attracting (see [?] for the proof).

Conversely, there are CAs with $B\mu$-attracting sets that are not in $[\mathcal{D} \to 0]$: the rule S (see figure 4) with 3 states (a blank one B, a state l that moves to the left and a state r that moves to the right, the collision of two particles leading to their annihilation) is in $[\mathcal{D} \to a]$. In addition, we will describe later a CA that experimentally seems to be in $[\mathcal{D} \to +\infty]$ but tends to a uniform configuration.



[Figure: diagram of the classes, including $[\mathcal{D} \to a]$ and $[\mathcal{D} \to +\infty]$, with Wolfram's empirical separation between classes 2 and 4]

Fig. 5. Relations between damage spreading, $\mu$-a.e. sensitiveness and the existence of $B\mu$-attracting sets when $\mu$ is a Bernoulli measure



Let us now investigate the relations between damage spreading and $\mu$-almost everywhere sensitiveness. The idea is to take a $\mu$-random configuration $c$. We then build $c'$ from $c$: for each cell, the state of $c'$ is equal to the state of $c$ with probability $1 - \eta$, and is equal to $q$ with probability $\eta\, p_q$. With probability 1, the Besicovitch pseudo-distance $d(c, c')$ between $c$ and $c'$ is $\eta\, d(c, c'') \le \eta$. So we can prove the following theorem.

Theorem 5. Being $\mu$-almost everywhere sensitive to initial conditions implies having unbounded damage spreading (i.e. $[\mu\text{-aes}] \subseteq [\mathcal{D} \to +\infty]$) (see [?] for the proof).
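
The construction of $c'$ from $c$ described above can be tried numerically. The sketch below is an informal experiment, not the paper's method: it assumes elementary CAs on a finite ring, a uniform Bernoulli measure (so $p_q = 1/2$), and uses the Hamming density on the ring as a finite stand-in for the Besicovitch pseudo-distance. The helper names are hypothetical.

```python
import random


def eca_step(cells, rule):
    """One step of an elementary CA (Wolfram rule number) on a ring."""
    n = len(cells)
    return [(rule >> (4 * cells[i - 1] + 2 * cells[i] + cells[(i + 1) % n])) & 1
            for i in range(n)]


def perturb(c, eta, rng):
    """Keep each cell with probability 1 - eta, else redraw it uniformly."""
    return [rng.randint(0, 1) if rng.random() < eta else s for s in c]


def hamming_density(a, b):
    """Proportion of cells on which two finite configurations differ."""
    return sum(s != t for s, t in zip(a, b)) / len(a)


def damage_after(rule, c, c_prime, steps):
    """Hamming density between the two evolutions after `steps` steps."""
    for _ in range(steps):
        c, c_prime = eca_step(c, rule), eca_step(c_prime, rule)
    return hamming_density(c, c_prime)


rng = random.Random(0)
c = [rng.randint(0, 1) for _ in range(2000)]
c_prime = perturb(c, 0.01, rng)
for rule in (0, 128, 184, 90):
    print(rule, damage_after(rule, c, c_prime, 200))
```

Rule 90 is additive, so by Theorem 2 one would expect a large final density for it, while rule 0 erases every difference after one step.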

It is experimentally easy to observe, but it seems difficult to prove, that 184 is $\mu_{1/2}$-almost everywhere sensitive to initial conditions. To understand what happens, let us consider a random walker on $\mathbb{Z}$ starting at 0 and reading the initial configuration from the cell 0 toward the right (resp. left). When he reads 1 he goes toward the right, and when he reads 0, toward the left. If we change the cell 0 to 1 (resp. to 0), it generates at least one modification in all the subsequent configurations. This modification moves to the right when the random walker is in $\mathbb{Z}^+$, to the left when it is in $\mathbb{Z}^-$, and does not move if the random walker is on 0. Sometimes, because of this modification, a whole region of background shifts and all the cells of this region are changed. Actually, the overall evolution of 184 leads to bigger and bigger homogeneous regions of $(01)^*$ and $(10)^*$, but when we take two generic initial configurations, there is no reason for these regions to match; thus, asymptotically, the pseudo-distance between the configurations after many computation steps is 1/2.

Fig. 6. 54 and 184 for $\mu_{1/2}$ are in $[\mathcal{D} \to +\infty] \cap [S_\mu \to 0]$; we do not know where 110 is

It seems that the rule 54 also belongs to $[\mu\text{-aes}] \cap [S_\mu \to 0]$. Actually, particle disappearance is an irreversible phenomenon, as proved in [?], which occurs more and more rarely as the $g_e$ particles become fewer and fewer; nevertheless, the number of $g_e$ particles tends to 0, as confirmed by experiments [?]. Then, since one particle disappears whenever an interaction between $w$ and $g_o$ particles occurs, we can think that the sub-shift $\{(0001)^*, (1110)^*\}$ is $D\mu_p$-attracting for $0 < p < 1$.

The previous theorem raises a natural question: are the CAs with unbounded damage spreading $\mu$-almost everywhere sensitive to initial conditions?

It seems not. Let us consider a CA 54* that simulates the rule 54 but in a uniform background. Such a CA can be formally defined thanks to Hanson and Crutchfield's filter [?], which is a CA but only on "valid" configurations. Here, we consider any extension of this CA, for a measure such that a generic configuration is correct with probability 1. Experimentally, the particle number decreases like $1/\sqrt{t}$ and thus tends to 0, so that this CA, for well chosen measures, tends to a uniform configuration. Because the particle number decreases slowly, this CA seems to have unbounded damage growth: with a non-null probability, a single perturbation creates or suppresses one or more particles. At each interaction of a defect particle with a particle, the defect is duplicated. The probability of such collisions is linked to the density of particles, thus the average number of defects should look like $\int_0^x dt/\sqrt{t} = 2\sqrt{x}$, which tends to $+\infty$ when $x \to +\infty$ (this is experimentally observed). Furthermore, this CA does not seem to be almost everywhere sensitive to initial conditions because of the following theorem:

Theorem 6. Let A be a CA that almost everywhere tends to a uniform configuration. Then A is not almost everywhere sensitive to initial conditions (see [?] for the proof).

Whether there exist almost everywhere sensitive CAs with a $B\mu$-attracting set is an open question.




Fig. 7. All the chaotic behaviors among 2-state one-dimensional CAs



4 Links with Wolfram’s Classification 

On ECAs, the CAs in $[\mu\text{-aes}] \setminus [S_\mu \to 0]$ practically match the CAs that Wolfram put in his class 3; they are represented in figure 7. It is not certain that this remains true for more complicated CAs (with more states, in higher dimension or with a bigger neighborhood), because they may present many different behaviors depending on the initial configuration. Anyway, if we assume that this is a good formalization, Wolfram's observation that "the value of a particular site depends on an ever-increasing number of initial site values" is proved. Furthermore, we know that this condition is not sufficient to imply chaoticity.

Wolfram's class 4 on ECAs seems to split into two parts: the class $[\mu\text{-aes}] \cap [S_\mu \to 0]$ and a part of the class $[\mathcal{D} \to 0]$. But the definition of this class (existence of particles), or the conjecture of the universality of its elements, leads us to think that this class is completely independent of our criteria. The main point is that it is easy to build universal CAs in any non-empty class, since there are some in the class $[\mathcal{D} \to 0]$ (a CA is usually universal on a null measure set).

The evolution of CAs that have $B\mu$-attracting sets tends $\mu$-almost everywhere to a set of uniform configurations. Their behaviors look like the behaviors of Wolfram's class 1 CAs, which "evolve after a finite number of time steps from almost all initial states to a unique homogeneous state, in which all sites have the same value". Actually, it seems natural to think that the rule 128 (see figure 2), which erases a succession of $n$ states 1 in $n/2$ time steps, is in class 1, and this is confirmed by the examples of class 1 CAs given by Wolfram. But, although $0^*$ is obviously a $B\mu$-attracting set, with probability 1 the evolution does not converge to $0^*$ after a finite number of time steps. The point is that the probability of finding a sequence of $n$ successive 1s in the configuration is 1, whatever $n$. Next, Wolfram writes that "their evolution completely destroys any information on the initial state". We proved this fact for CAs with $B\mu$-attracting sets, but we also saw that complicated rules like 184 for $p = 1/2$ completely destroy any information on the initial state.

The big problem would be to find a way to split the CAs whose damage 
spreading is bounded but does not tend to 0 in such a way that rules like 118 
or 109 are not together with the identity. If the behavior of the damage growth 
of a CA is almost independent of the measure we take, we can separate the 
CAs whose damage spreading is uniformly bounded from the others. This would 
separate the identity from 118 and 109, but the rule 210 would be in the second 
case, that was not really expected. 

Conclusion and Open Questions 

One of the main conclusions of this article is that our intuitive property of chaos does not allow us to split the set of CAs into two classes, because some of them may have a chaotic or a non-chaotic behavior depending on the way a random configuration is chosen. In addition, we see that the Besicovitch topology, which has been specifically introduced for CAs to express an intuitive notion of chaos, is indeed a very interesting notion. Note that the introduction of a measure is very helpful for dealing with the overly wide class of CAs that are neither sensitive nor equicontinuous. In this article, we introduce a new Lyapunov-like notion that makes it possible to measure information diffusion. This notion appears more precise than the other ones, but it is still not enough to ensure a chaotic behavior. Finally, the introduced notions make it possible to formalize some of Wolfram's observations and thus to prove how relevant they are.

Many open questions remain, among them:

— to find examples in, or to prove the emptiness of, $[\mu\text{-aes}]$ with a $B\mu$-attracting set, and of CAs in $[\ldots \to 0]$ with no $B\mu$-attracting set and not in $[\mu\text{-aes}]$;

— the generalization of the theorems to more general measures (with exponentially decreasing correlations);

— to find a "good" definition that separates the identity from 118 in $[\mathcal{D} \to a]$.

Abstract. In this paper, we first give a brief overview of discrepancy theory, then introduce low-discrepancy sequences, in particular the original Faure and generalized Faure sequences. Next, we describe how to apply them to the problem of pricing financial derivatives, along with a successful application of this technique to the valuation of the present value of mortgage-backed securities (MBS). Finally, we discuss future research directions.



1 Introduction 

Discrepancy theory is a branch of number theory whose historical origin is the theory of uniform distribution developed by H. Weyl and other mathematicians in the early days of the 20th century [?]. While the latter deals with the uniformity of infinite sequences of points, the former deals with the uniformity of finite sequences. Finite sequences always have some irregularity or deviation from the ideal uniformity due to their finiteness. Discrepancy is a mathematical notion for measuring such irregularity. For this reason, discrepancy theory is sometimes called the theory of irregularities of distribution.

Quasi-Monte Carlo methods are one of the most successful applications of discrepancy theory. While Monte Carlo methods assume random numbers to provide probabilistic error bounds via the central limit theorem, quasi-Monte Carlo methods use low-discrepancy sequences¹ to allow deterministic error bounds via the Koksma-Hlawka theorem. The idea underlying low-discrepancy sequences is to use point sets that are not randomly distributed, but very uniformly distributed throughout the domain of integration. The extent to which the points are uniform has been mathematically defined as their discrepancy. The more uniformly distributed the points are, the lower the discrepancy is.

¹ Some people call them quasi-random sequences, but this term is a misnomer, since low-discrepancy sequences are totally deterministic.

Monte Carlo simulations are a common technique in finance, particularly in derivative pricing and in VaR (Value at Risk) computations, since they are simple to understand and to implement. In the area of derivative pricing, stochastic differential equations have been widely used as a modeling tool to describe the time evolution of the price of the underlying financial asset. The more complex stochastic models become, the less available analytical solutions are. Thus, Monte Carlo simulations are viewed as a last resort for the computation involved in these applications. On the other hand, the more accurate the solution is required to be, the more computing time is needed. Monte Carlo simulations are quite unsatisfactory in this respect because of their main drawback, namely their notoriously slow convergence rate. According to the central limit theorem, the convergence rate is $O(1/\sqrt{N})$, where $N$ is the number of sample points.

The use of low-discrepancy sequences for finance problems began around 1992 with Paskov and Traub [?]. They reported that quasi-Monte Carlo methods performed very well relative to simple Monte Carlo methods, as well as to antithetic Monte Carlo methods, for pricing a ten-tranche CMO (Collateralized Mortgage Obligation), which they obtained from Goldman Sachs [?]. Since then, many people (e.g., see the references in [?]) have followed them and confirmed their findings on different pricing problems by using different types of low-discrepancy sequences.

The organization of this paper is as follows. Section 2 briefly overviews discrepancy theory: the original definition of discrepancy and several of its variants. In Section 3, we introduce quasi-Monte Carlo methods, which are the deterministic version of Monte Carlo methods, and describe a general method of constructing low-discrepancy sequences, then in detail the so-called generalized Faure sequences. In Section 4, we give numerical experiments on the problem of computing the present values of MBS, along with some discussion of the results. In the last section, we conclude the paper with important research topics.

2 Brief Overview of Discrepancy Theory 

2.1 Definition of Discrepancy 

The formal definition of discrepancy is as follows. For $N$ points $X_0, X_1, \ldots, X_{N-1}$ in $[0,1]^k$, we denote

\[ I(f) = \int_{[0,1]^k} f(x_1, \ldots, x_k)\, dx_1 \cdots dx_k \]

and

\[ Q_N(f) = \frac{1}{N} \sum_{i=0}^{N-1} f(X_i). \qquad (1) \]

We define a subinterval to be $J = [0, u_1) \times \cdots \times [0, u_k)$, where $0 < u_h \le 1$ for $1 \le h \le k$, and also define the characteristic function as $\chi_J(x) = 1$ if $x \in J$; otherwise 0. We commonly have the following two different definitions of discrepancy:

Definition 1. The $L_\infty$-discrepancy is defined by

\[ \| I - Q_N \|_{k,\infty} \stackrel{\mathrm{def}}{=} \sup_J \left| I(\chi_J) - Q_N(\chi_J) \right|, \]

where the supremum is extended over all subintervals $J$.



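Definition 2. The $L_2$-discrepancy is defined by

\[ \| I - Q_N \|_{k,2} \stackrel{\mathrm{def}}{=} \left( \int_{[0,1]^k} \left( I(\chi_J) - Q_N(\chi_J) \right)^2 du_1 \cdots du_k \right)^{1/2}. \qquad (2) \]

By expanding the right-hand side, the $L_2$-discrepancy is explicitly written as

\[ \| I - Q_N \|_{k,2}^2 = \int\!\!\int K(x, y)\, dx\, dy - \frac{2}{N} \sum_{i=0}^{N-1} \int K(x, X_i)\, dx + \frac{1}{N^2} \sum_{i,j=0}^{N-1} K(X_i, X_j), \]

where

\[ K(s, t) = \int_{[0,1]^k} \chi_{[0,u)}(s)\, \chi_{[0,u)}(t)\, du = \prod_{h=1}^{k} \left( 1 - \max(s_h, t_h) \right). \qquad (3) \]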
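
The expanded form lends itself to direct computation, since both integrals of the kernel have closed forms: $\int\!\!\int K = 3^{-k}$ and $\int K(x, X_i)\,dx = \prod_h (1 - X_{ih}^2)/2$. The following is a minimal sketch of that computation. The function name `l2_star_discrepancy` is a hypothetical choice, and the straightforward $O(N^2 k)$ double loop is only meant to mirror the formula, not to be efficient.

```python
from math import prod, sqrt


def l2_star_discrepancy(points):
    """L2-discrepancy of a point set in [0,1]^k via the expanded formula.

    points : list of k-tuples of floats in [0, 1]
    """
    n = len(points)
    k = len(points[0])
    # Double integral of K(x, y) = prod_h (1 - max(x_h, y_h)) over [0,1]^2k.
    double_integral = 3.0 ** (-k)
    # Sum over i of the integral of K(x, X_i) dx = prod_h (1 - X_ih^2) / 2.
    single = sum(prod((1.0 - x * x) / 2.0 for x in p) for p in points)
    # Sum over all pairs (i, j) of K(X_i, X_j).
    pairs = sum(prod(1.0 - max(s, t) for s, t in zip(p, q))
                for p in points for q in points)
    squared = double_integral - 2.0 * single / n + pairs / n ** 2
    return sqrt(max(squared, 0.0))  # guard against tiny negative round-off


print(l2_star_discrepancy([(0.5,)]) ** 2)
print(l2_star_discrepancy([(0.25, 0.25), (0.75, 0.75), (0.25, 0.75), (0.75, 0.25)]))
```

For the single point $0.5$ in dimension 1 the squared value is $1/12$, which can be checked by integrating $(u - \chi_{[0,u)}(0.5))^2$ directly.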



The well-known lower and upper bounds for the discrepancy are as follows. For any $N$-point set, the $L_2$-discrepancy satisfies

\[ \| I - Q_N \|_{k,2} \ge c_k \frac{(\log N)^{(k-1)/2}}{N}, \]

where $c_k$ is a constant depending only on $k$. Roth (see, e.g., [?]) already proved the matching upper bound for this lower bound, although his proof was non-constructive.² Regarding the $L_\infty$-discrepancy, we have the same lower bound as for the $L_2$-discrepancy, but the upper bound is given as

\[ \| I - Q_N \|_{k,\infty} = O\!\left( \frac{(\log N)^{k-1}}{N} \right). \]

It is conjectured [?] that this order of magnitude in the upper bound is optimal, i.e., that the lower bound can be further improved.



² Henri Faure informed me that Chen and Skriganov recently found a constructive proof.

2.2 Several Variations of Discrepancy

In what follows, we give several variations of the definition of discrepancy.

Basis functions:

From the definition in equation (2), the discrepancy can be seen as an integration error of the indicator function. So, by choosing "basis" functions different from the indicator function, we obtain different definitions of discrepancy. One example is due to Paskov [?], who chose



as a basis function. Note that the case of $r = 0$ is equivalent to the original definition (2). In this case, $K(s, t)$ in (3) is changed to



If we regard the bivariate function $K(s, t)$ as a reproducing kernel on some Hilbert space, we can derive a more general version of the Koksma-Hlawka theorem, i.e., the worst-case error estimate of integration. If we see $K(s, t)$ as the covariance kernel of some Gaussian measure, we can derive a more general version of the Woźniakowski theorem, i.e., the average-case error estimate of integration. For more information, see Hickernell [3]. Both the Koksma-Hlawka and Woźniakowski theorems are discussed in more detail in the next section.

Subspaces: 

Another variant of the discrepancy is obtained by changing the subintervals $J$ (as well as $I$) to another class of subsets, such as half spaces, circles (or spheres), convex sets, etc. In terms of the order of magnitude of the discrepancy, there are roughly two families. In the first one, the region $J$ consists of scaled and translated copies of a fixed polygon or polytope. The original definition belongs to this family because $J$ ranges over a class of axis-parallel boxes, in which no rotation is allowed. For this family, the discrepancy is bounded from above and from below by some constant powers of $\log N$ divided by $N$. In the other family, we allow rotation besides scaling and translation. For example, arbitrarily rotated rectangular boxes are included in this family. This family also includes the class of convex bodies, for which Schmidt obtained a lower bound of order $N^{-2/(d+1)}$. The upper bound was obtained by Beck ($d = 2$) as $N^{-2/3} \log^4 N$ and by Stute ($d \ge 3$) as $N^{-2/(d+1)} \log^c N$, where $c = 1.5$ for $d = 3$ and $c = 2/(d+1)$ for $d \ge 4$. For this family, the discrepancy is bounded by the order of some negative fractional power of $N$. See [?] for more details.

Nonuniform weights: 

Another interesting generalization of the definition of discrepancy is the use of non-uniform (or unequal) weights. Then, equation (1) is replaced by

\[ Q_{N,w}(f) = \sum_{i=0}^{N-1} w_i f(X_i), \]

where the $w_i$ are all real numbers. If $w_i = 1/N$ for all $i$, the original definition follows. This kind of definition is particularly useful for the error analysis of numerical integration, where non-uniform weights are commonly used in quadrature (and cubature) formulas, e.g., the Simpson rule, the Gauss rule, etc. By exploiting this definition, Wasilkowski and Woźniakowski [22] obtained the following remarkable result:

Theorem 1. There exists a cubature formula $Q_{N,w}$ such that

\[ \| I - Q_{N,w} \|_{k,2} \le \varepsilon \]

for $N(\varepsilon, k)$ satisfying

$N(\varepsilon, k) \le$

where $A$ is an absolute constant.

The significance of this result becomes clear if we compare it with Monte Carlo methods, which need a number of sample points

\[ N(\varepsilon, k) = O(\varepsilon^{-2}) \]

to achieve a standard deviation of $\varepsilon$. Another important point is that $A$ does not depend on the dimension $k$. No similar results have so far been obtained for the discrepancy with uniform weights.

Combinatorial discrepancy : 

The most abstract variant of the definition of the discrepancy is combinatorial discrepancy (or red-blue discrepancy). Let $X$ be a set with cardinality $n$, and let $S$ be a system of subsets of $X$. A mapping $\chi : X \to \{-1, 1\}$ is called a coloring. The discrepancy of $S$ is defined as

\[ \mathrm{disc}(S) = \min_{\chi} \mathrm{disc}(S, \chi), \]

where $\mathrm{disc}(S, \chi) = \max_{A \in S} |\chi(A)|$ and $\chi(A) = \sum_{x \in A} \chi(x)$. The following upper bound can be derived by using a random coloring:

Theorem 2. We have

\[ \mathrm{disc}(S) = O\!\left( \sqrt{n \log m} \right), \]

where we assume that $S$ has at most $m$ subsets.
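
For very small set systems the definition can be evaluated by brute force, which may help in reading it. The sketch below is illustrative only: the function names are hypothetical, the exhaustive search over all $2^n$ colorings is usable only for tiny $n$, and the restriction of the arithmetic progressions to $1 \le a, b \le n$ is an assumption made to keep the system finite.

```python
from itertools import product


def disc_of_coloring(sets, chi):
    """disc(S, chi) = max over A in S of |sum of chi(x) for x in A|."""
    return max(abs(sum(chi[x] for x in A)) for A in sets)


def disc(ground, sets):
    """disc(S) = min over all colorings chi: ground -> {-1, +1}."""
    best = None
    for signs in product((-1, 1), repeat=len(ground)):
        value = disc_of_coloring(sets, dict(zip(ground, signs)))
        if best is None or value < best:
            best = value
    return best


def arithmetic_progressions(n):
    """Sets {a, a+b, a+2b, ...} intersected with {1..n}, for 1 <= a, b <= n."""
    return {frozenset(range(a, n + 1, b))
            for a in range(1, n + 1) for b in range(1, n + 1)}


ground = [1, 2, 3, 4]
print(disc(ground, [{1, 2}, {3, 4}, {1, 2, 3, 4}]))
print(disc(list(range(1, 9)), arithmetic_progressions(8)))
```

The first system admits a perfectly balanced coloring (alternate the signs), so its discrepancy is 0. The second contains singletons, so its discrepancy is at least 1.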

Below, we give two typical examples of a set system $(X, S)$.

Example 1. Consider a graph $(V, E)$. Let $V$ be the set $X$, and let $S$ be the set of subsets $A(v)$, where $A(v)$ denotes the set of neighboring nodes of $v \in V$. Then, the discrepancy with respect to this set system $(V, S)$ is known to be a very useful measure for the graph coloring problem, which plays an important role in theoretical computer science.

Example 2. Consider the set $X = \{1, 2, \ldots, n\}$ and let $S$ be the set of all subsets $\{a, a+b, a+2b, \ldots\} \cap X$, where $a$ and $b$ are any integers. The discrepancy with respect to this set system $(X, S)$ is known to be closely related to the famous van der Waerden theorem in number theory.

For more details and further discussion of discrepancy in general, see Matoušek's excellent book [?].
3 Quasi-Monte Carlo Methods



3.1 Numerical Integration and Low-Discrepancy Sequences 

Finance-related Monte Carlo problems, particularly those related to derivative pricing, can be formulated as problems of computing multidimensional integrals. The dimension of the integration is usually equal to the number of time steps by which the time interval under consideration is discretized. Once a problem can be formulated as one of numerical multidimensional integration, we have several "deterministic" cubature formulas. However, the direct extensions of one-dimensional quadrature formulas, such as trapezoidal rules, to higher dimensions do not work well because of the curse of dimensionality [?]. For example, the error bound for the $k$-dimensional trapezoidal rule is known to be $O(N^{-2/k})$, which means that the error grows exponentially as the dimension becomes larger.

Low-discrepancy sequences have been considered a promising alternative for high-dimensional numerical integration because their asymptotic error bound is $O((\log N)^k / N)$ for the $k$-dimensional integration problem. The Koksma-Hlawka theorem gives an important relation between discrepancy and numerical integration [?]:

Theorem 3 (Koksma-Hlawka). If the integrand $f$ is of bounded variation on the $k$-dimensional unit hypercube $[0,1]^k$ in the sense of Hardy and Krause, then for any $X_0, X_1, \ldots, X_{N-1} \in [0,1)^k$ we have

\[ | I(f) - Q_N(f) | \le \| I - Q_N \|_{k,\infty}\, \| f \|, \]

where $\| f \|$ is the Hardy-Krause variation of $f$.

What is most important here is that the bound is a product of two separate elements: one depends only on the point set and is independent of $f$; the other depends only on the integrand and is independent of the point set. For a given integrand $f$, choosing a point set that makes the discrepancy as small as possible gives the smallest bound on the integration error. We also have another important result, the Woźniakowski theorem [?]:

Theorem 4 (Woźniakowski). Let $C_k$ be the class of real continuous functions defined on $[0,1]^k$ equipped with the classical Wiener sheet measure $w$ (that is, Gaussian with mean zero and covariance kernel

\[ R(s, t) \stackrel{\mathrm{def}}{=} \int_{C_k} f(s) f(t)\, w(df) = \min(s, t) \stackrel{\mathrm{def}}{=} \prod_{h=1}^{k} \min(s_h, t_h) \]

for any vectors $s = (s_1, \ldots, s_k)$ and $t = (t_1, \ldots, t_k)$ in $[0,1]^k$). Then, for a given set of points $X_i = (x_{i1}, \ldots, x_{ik})$, $i = 0, 1, \ldots, N-1$, in $[0,1]^k$, we have

\[ \int_{C_k} \left( I(f) - Q_N(f) \right)^2 w(df) = \| I - Q_N \|_{k,2,J}^2, \]

where $\| I - Q_N \|_{k,2,J}$ is defined as the $L_2$-discrepancy with $J = (u_1, 1] \times \cdots \times (u_k, 1]$ in equation (2), where $0 \le u_h < 1$ for $1 \le h \le k$.

The theorem means that, on average, the integration error depends only on the discrepancy, not on the integrand $f$. Both of the above theorems tell us that the lower the discrepancy is, the smaller the integration error will be.

We now introduce the formal definition of low-discrepancy sequences: 

Definition 3. If a sequence $X_0, X_1, \ldots$ in $[0,1]^k$ satisfies the condition that for all $N > 1$, the discrepancy of the first $N$ points is

\[ \| I - Q_N \|_{k,\infty} \le c_k \frac{(\log N)^k}{N}, \]

where $c_k$ is a constant depending only on the dimension $k$, then we call it a low-discrepancy sequence.

Notice that the order of magnitude $(\log N)^k / N$ on the right-hand side is believed to be the optimal upper bound. Therefore, the sole difference among the many types of low-discrepancy sequences is how small the constant factor $c_k$ is. In this article, we concentrate on the construction of low-discrepancy sequences based on $(t,k)$-sequences in base $b$. Before introducing them, we need the following definitions:



Definition 4. A $b$-ary box is an interval of the form

\[ E = \prod_{h=1}^{k} \left[ a_h b^{-d_h},\ (a_h + 1) b^{-d_h} \right) \]

with integers $d_h \ge 0$ and integers $0 \le a_h < b^{d_h}$ for $1 \le h \le k$.

Definition 5. Let $0 \le t \le m$ be integers. A $(t, m, k)$-net in base $b$ is a point set of $b^m$ points in $[0,1]^k$ such that every $b$-ary box of volume $b^{t-m}$ contains exactly $b^t$ points of the point set.

Now, we define $(t, k)$-sequences in base $b$.

Definition 6. Let $t \ge 0$ be an integer. A sequence of points $X_0, X_1, \ldots$ in $[0,1]^k$ is called a $(t, k)$-sequence in base $b$ if for all integers $j \ge 0$ and $m > t$, the point set consisting of $[X_n]_m$ with $j b^m \le n < (j+1) b^m$ is a $(t, m, k)$-net in base $b$, where $[X]_m$ denotes the coordinate-wise $m$-digit truncation in base $b$ of $X$.

Following Sobol' and Faure's results, Niederreiter [?] obtained the following theorem for an arbitrary integer base $b \ge 2$:

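Theorem 5 (Niederreiter). For any $N > 1$, the discrepancy of the first $N$ points of a $(t, k)$-sequence in base $b$ satisfies

\[ \| I - Q_N \|_{k,\infty} \le c(t, k, b) \frac{(\log N)^k}{N} + O\!\left( \frac{(\log N)^{k-1}}{N} \right), \]

where $c(t, k, b) \approx \frac{b^t}{k!} \left( \frac{b}{2 \log b} \right)^k$.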

This means that if $t$ and $b$ are constant or depend only on $k$, then the $(t, k)$-sequence becomes a low-discrepancy sequence. Note that a smaller value of $t$ gives a lower discrepancy asymptotically. Thus, a $(0, k)$-sequence can be said to be the best in this sense. We should notice that any subset of $s \le k$ coordinates of a $(0, k)$-sequence constitutes a $(0, s)$-sequence.

3.2 Generalized Faure Sequences 

Niederreiter presented a general construction principle for $(t, k)$-sequences as follows. Let $k \ge 1$ and $b \ge 2$, and let $B = \{0, 1, \ldots, b-1\}$. Accordingly, we define

(i) a commutative ring $R$ with identity and $\mathrm{card}(R) = b$;

(ii) bijections $\psi_j : B \to R$ for $j = 1, 2, \ldots$, with $\psi_j(0) = 0$ for all sufficiently large $j$;

(iii) bijections $\lambda_{hi} : R \to B$ for $h = 1, 2, \ldots, k$ and $i = 1, 2, \ldots$, with $\lambda_{hi}(0) = 0$ for $1 \le h \le k$ and all sufficiently large $i$;

(iv) elements $c^{(h)}_{ij} \in R$ for $1 \le h \le k$, $1 \le i$, $1 \le j$, where for fixed $h$ and $j$ we have $c^{(h)}_{ij} = 0$ for all sufficiently large $i$.

For $n = 0, 1, 2, \ldots$, write $n = \sum_{r=1}^{\infty} a_r(n)\, b^{r-1}$ with $a_r(n) \in B$. For $h = 1, \ldots, k$, set the $h$-th coordinate of the point $X_n$ in $[0,1]^k$ as

\[ X_n^{(h)} = \sum_{i=1}^{\infty} x^{(h)}_{ni}\, b^{-i}, \]

where

\[ x^{(h)}_{ni} = \lambda_{hi}\!\left( \sum_{j=1}^{\infty} c^{(h)}_{ij}\, \psi_j(a_j(n)) \right) \in B \]

for $1 \le h \le k$, $1 \le i$ and $0 \le n$. We call $C^{(h)} = (c^{(h)}_{ij})$ the generator matrix for the $h$-th coordinate of a $(t, k)$-sequence.

We now describe how to construct such generator matrices so that we obtain low-discrepancy sequences called generalized Niederreiter sequences (see Tezuka [?]).³ The construction is based on $GF\{b, z\}$, i.e., the formal Laurent series expansions over the finite field $GF(b)$, where $b$ is a prime power. Denote $S(z) \in GF\{b, z\}$ by

\[ S(z) = \sum_{j=w}^{\infty} a_j z^{-j}, \]

where all $a_j \in GF(b)$ and $w$ is an arbitrary integer. Hereafter, we use the following notations: $[S(z)]$ denotes the polynomial part of $S(z)$ and $[S(z)]_{p(z)} \stackrel{\mathrm{def}}{=} [S(z)] \pmod{p(z)}$ with $0 \le \deg([S(z)]_{p(z)}) < \deg(p(z))$.

³ In 1995, Niederreiter and Xing [?] presented a new construction method of low-discrepancy sequences based on algebraic function fields. Today, their sequences are known as the "theoretically" best, but there are several implementation issues to be overcome before these sequences become available for practical use.
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Let k polynomials pi(z), ...,pk{z) be pairwise coprime and let et = deg{ph) > 
1 for 1 < h < k. For m > 1,1 < h < k, and j > 1, consider the expansion 



Vhm{z) 

Ph{zy 






by which the elements {j, m, r) € GF{b) are determined. Here w < 0 may 
depend on h,j,m, and each yhm{z) is a polynomial such that the residue poly- 
nomials [ 2 //im(-z)]pfe( 2 )) (j ~ l)e/i < TO — 1 < jeh, are linearly independent over 
GF{b) for any j > 0 and 1 < h < k. Then define 

Cmr = + l,TO,r) 

ioT 1 <h <k,m> 1, and r > 1, where rrih = [(to — l)/eh]- 
Tezuka proved the following theorem:

Theorem 6. If an integer $t \ge 0$ satisfies $t \ge \sum_{h=1}^{k} \deg(p_h) - k$, then the generalized Niederreiter sequence becomes a $(t,k)$-sequence in base $b$.

Remark 1. Faure sequences are $(0,k)$-sequences in a prime base $b \ge k$ obtained from generalized Niederreiter sequences such that all $y_{hm}(z) = 1$.

This remark motivates the following definition:

Definition 7. Generalized Faure sequences are defined as $(0,k)$-sequences obtained from generalized Niederreiter sequences.

In practice, we choose the base $b$ to be prime, for which we have the following matrix representation for all generator matrices $C^{(h)}$, $1 \le h \le k$:

\[ C^{(h)} = A^{(h)} P^{h-1}, \qquad (4) \]

where $A^{(h)}$, $1 \le h \le k$, are nonsingular lower triangular matrices over $GF(b)$ and $P$ is the Pascal matrix whose $(i,j)$ element is equal to $\binom{j-1}{i-1}$. The original Faure sequences correspond to the case in which $A^{(h)} = I$ for all $h$.
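As a rough illustration of equation (4), the following Python sketch generates points from generator matrices of the form $A^{(h)} P^{h-1}$ over $GF(b)$ for a prime base $b$. It is only a sketch under stated assumptions: the bijections $\psi_j$ and $\lambda_{hi}$ are taken to be the identity, every digit expansion is truncated to `size` digits, and the helper names (`pascal_matrix`, `mat_mul`, `digits`, `faure_point`) are hypothetical and do not come from the paper.

```python
from fractions import Fraction
from math import comb


def pascal_matrix(size, b):
    # 0-based entry [i][j] is C(j, i) mod b, i.e. C(j-1, i-1) in 1-based indexing.
    return [[comb(j, i) % b for j in range(size)] for i in range(size)]


def mat_mul(A, B, b):
    # Matrix product over GF(b) for a prime b (arithmetic mod b).
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) % b for j in range(n)]
            for i in range(n)]


def digits(n, b, size):
    # Base-b digits a_1(n), a_2(n), ... (least significant first), truncated to `size`.
    out = []
    for _ in range(size):
        out.append(n % b)
        n //= b
    return out


def faure_point(n, k, b, size=8, A=None):
    # n-th point in [0,1)^k. A is an optional list of k lower triangular
    # matrices (assumed nonsingular); A=None means A^(h) = I for all h.
    a = digits(n, b, size)
    P = pascal_matrix(size, b)
    power = [[int(i == j) for j in range(size)] for i in range(size)]  # P^0
    point = []
    for h in range(k):
        C = power if A is None else mat_mul(A[h], power, b)
        x = [sum(C[i][j] * a[j] for j in range(size)) % b for i in range(size)]
        point.append(sum(Fraction(x[i], b ** (i + 1)) for i in range(size)))
        power = mat_mul(power, P, b)  # next coordinate uses the next power of P
    return point
```

With `A=None` the sketch follows the original Faure case; passing randomly chosen nonsingular lower triangular matrices corresponds to a generalized Faure sequence, up to the digit truncation.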

4 Applications to Finance 

In this section, we apply quasi-Monte Carlo methods to a problem related to pricing Mortgage-Backed Securities (MBS) originally described by Paskov. In the experiment, we used the original Faure sequence and a generalized Faure sequence with base $b = 367$. More precisely, the nonsingular lower triangular matrices $A^{(h)}$, $1 \le h \le k$, in equation (4) for the generator matrices were chosen at random, and we omitted the first 100000 points; that is to say, we used the points $X_i$, $i = 100001, 100002, \ldots$, of the sequence.

Mortgage-Backed Securities

MBS, the most popular fixed income derivatives, are a kind of interest-rate option whose underlying asset is a pool of residential mortgage portfolios. They have a critical feature of prepayment privileges, because householders can prepay their mortgages at any time. The integration problem associated with MBS is summarized as follows. We use the following notation:

$r_k$: the appropriate interest rate in month $k$
$w_k$: the percentage prepaid in month $k$
$a_k$: the remaining annuity after $361 - k$ months
$C$: the monthly payment on the underlying mortgage pool

for $k = 1, 2, \ldots, 360$, where $a_k = 1 + d_1 + \cdots + d_1^{k-1}$ is constant with $d_1 = 1/(1+r_0)$ and $r_0$ is the current monthly interest rate. $C$ is also constant. The variable $r_k$ follows the discrete-time version of the following geometric Brownian motion:

\[ \log r_k - \log r_{k-1} = \Bigl(a - \frac{\sigma^2}{2}\Bigr)\Delta + \sigma\, dB, \]

where $\Delta = 1$ and $dB$ is the normal random variable with mean zero and variance $\Delta$. Here, we assume zero drift (i.e., $a = 0$) in order to make $E(r_k) = r_0$ for $k = 1, \ldots, 360$. Thus,

\[ r_k = K_0 \exp(\sigma z_k)\, r_{k-1}, \quad \text{for } k = 1, 2, \ldots, 360, \]

where $z_k$, $k = 1, 2, \ldots, 360$, are independent standard normally distributed random variables, and $K_0 = \exp(-\sigma^2/2)$.

The prepayment model for the variables $w_k$, $k = 1, 2, \ldots, 360$, depends on the interest rate $r_k$, $k = 1, 2, \ldots, 360$, as follows:

\[ w_k = K_1 + K_2 \arctan(K_3 r_k + K_4), \]

where $K_1$, $K_2$, $K_3$, and $K_4$ are given constants. In practice, the lower the interest rate is, the higher the prepayment rate becomes. Thus, the cash flow in month $k$ is

\[ M_k(z_1, \ldots, z_k) = C(1 - w_1) \cdots (1 - w_{k-1})(1 - w_k + w_k a_{361-k}). \]

This is multiplied by the discount factor

\[ d_k(z_1, \ldots, z_{k-1}) = \prod_{i=0}^{k-1} \frac{1}{1 + r_i}. \]

We have the following total present value of MBS:

\[ PV(z_1, \ldots, z_{360}) = \sum_{k=1}^{360} d_k(z_1, \ldots, z_{k-1})\, M_k(z_1, \ldots, z_k). \]

What we want to compute is the expected value of the present value $PV$ over all independent random variables $z_k$, $k = 1, \ldots, 360$. By using the inversion of the normal distribution, we can formulate this problem as one of computing a multivariate integration over $[0,1]^{360}$:

\[ E(PV) = \int_{[0,1]^{360}} PV\bigl(z_1(x_1), \ldots, z_{360}(x_{360})\bigr)\, dx_1 \cdots dx_{360}, \]

where $x_k = N(z_k)$ for $k = 1, \ldots, 360$.
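The present-value integrand can be sketched in a few lines of Python. This is a minimal sketch rather than the implementation used in the experiment: it assumes the model reads as $w_k = K_1 + K_2 \arctan(K_3 r_k + K_4)$, $d_k = \prod_{i=0}^{k-1} 1/(1+r_i)$ and $a_k = 1 + d_1 + \cdots + d_1^{k-1}$ with $d_1 = 1/(1+r_0)$, it sets $C = 1$ by default, and the function names `present_value` and `monte_carlo_pv` are hypothetical. The default parameter values are the ones quoted for the experiment.

```python
import math
import random


def present_value(z, r0=0.00625, K1=0.24, K2=0.134, K3=-261.17, K4=12.72,
                  sigma=0.2, C=1.0):
    # PV(z_1, ..., z_n) for one path of standard normal variates z.
    n = len(z)
    d1 = 1.0 / (1.0 + r0)
    # annuity[j] = 1 + d1 + ... + d1**(j-1), with annuity[0] = 0
    annuity = [0.0] * (n + 1)
    for j in range(1, n + 1):
        annuity[j] = annuity[j - 1] + d1 ** (j - 1)
    K0 = math.exp(-sigma ** 2 / 2)
    r = r0            # holds r_{k-1} at the top of each iteration
    discount = 1.0    # running product of 1 / (1 + r_i)
    remaining = 1.0   # (1 - w_1) ... (1 - w_{k-1})
    pv = 0.0
    for k in range(1, n + 1):
        discount /= 1.0 + r                     # d_k uses r_0, ..., r_{k-1}
        r = K0 * math.exp(sigma * z[k - 1]) * r  # r_k
        w = K1 + K2 * math.atan(K3 * r + K4)     # prepayment rate w_k
        cash_flow = C * remaining * (1.0 - w + w * annuity[n + 1 - k])
        pv += discount * cash_flow
        remaining *= 1.0 - w
    return pv


def monte_carlo_pv(n_paths, n_months=360, seed=0):
    # Plain Monte Carlo estimate of E[PV] with pseudo-random normal variates.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        total += present_value([rng.gauss(0.0, 1.0) for _ in range(n_months)])
    return total / n_paths
```

A quasi-Monte Carlo estimate would replace the pseudo-random variates by $z_k = N^{-1}(x_k)$, with $(x_1, \ldots, x_{360})$ taken from a low-discrepancy sequence. As a sanity check on the sketch, with $\sigma = 0$ the rates stay at $r_0$ and the present value collapses to $C \sum_{k=1}^{360} d_1^k$.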

In this experiment, we used the parameter set $(r_0, K_1, K_2, K_3, K_4, \sigma) = (.00625, .24, .134, -261.17, 12.72, .2)$ taken from earlier work, where the expected value of $PV$ is numerically computed as $143.0182 \times C$ by using more than one million sample paths. Figure 1 shows the convergence of Monte Carlo and quasi-Monte Carlo. The solid line (MBS.MC) shows the result obtained by the Monte Carlo method, while the two different dotted lines (MBS.faure and MBS.gfaure) show the results obtained by the two quasi-Monte Carlo methods. Here, we stress two important points: (1) Monte Carlo vs. quasi-Monte Carlo (generalized Faure), and (2) original Faure vs. generalized Faure. For comparison, if we look at 1000 samples, generalized Faure sequences converge to the correct value within an accuracy of $10^{-5}$, that is, $143.0182 \pm 0.00143$. On the other hand, the standard deviation of $PV$ computed from the first 1000 sample values of the Monte Carlo simulation is about 0.276. Thus, the 99% confidence interval is about $143.0182 \pm 2.575 \times 0.276/\sqrt{1000} = 143.0182 \pm 0.0225$. Since the Monte Carlo error decreases like $1/\sqrt{N}$, matching the quasi-Monte Carlo accuracy would take about $(0.0225/0.00143)^2 \approx 250$ times as many samples. Therefore, we can say that about 250 times speed-up was gained by quasi-Monte Carlo with the generalized Faure sequence for this problem. On the other hand, from the comparison between Monte Carlo and quasi-Monte Carlo with the original Faure sequence, we see that the convergence behavior looks similar.
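The arithmetic behind the speed-up estimate can be checked with a short Python sketch. It assumes that the Monte Carlo error scales like $1/\sqrt{N}$, so that the speed-up is the squared ratio of the two error levels; the variable names are illustrative only.

```python
import math

mc_std = 0.276              # sample standard deviation of PV over 1000 paths
n_samples = 1000
z_99 = 2.575                # two-sided 99% normal quantile used in the text
qmc_error = 143.0182e-5     # relative accuracy 10^-5 of the generalized Faure run

# Half-width of the 99% Monte Carlo confidence interval.
mc_half_width = z_99 * mc_std / math.sqrt(n_samples)

# MC needs (ratio of errors)^2 times more samples to reach the QMC accuracy.
speedup = (mc_half_width / qmc_error) ** 2

print(round(mc_half_width, 4), round(speedup))
```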

5 Conclusion 

Concluding the paper, I would like to present the following two interesting questions:

MC vs. QMC

Why does QMC outperform MC so significantly for high-dimensional numerical integration associated with finance problems? One idea to explain this result is the "effective dimension", which comes from the observation that in many finance problems a small number of variables (say, near-future interest rates) are very crucial for the payoff function, while all the others are much less important. Thus, it is thought that even if the nominal dimension is large, say 360 for finance problems, the effective dimension should be small. This would explain why QMC was so successful for finance problems with large dimensions. Several proposals for formal and quantitative definitions of the effective dimension have been made by several authors, but, at this moment, no decisive one is yet available to clearly explain QMC's excellent performance in finance.

Papageorgiou recently found that QMC also works very well compared with MC for isotropic problems in physics. And Tezuka pointed out that there also exists an isotropic problem in finance for which QMC outperforms MC. In these cases, the idea of effective dimension does not work because the importance of all the variables is the same for the isotropic integrand. The problem of how to explain these results is another important research topic.

Fig. 1. Convergence of Monte Carlo and quasi-Monte Carlo methods for MBS (price vs. sample paths)

Faure vs. GFaure 

Among the many classes of low-discrepancy sequences, the practical convergence behavior differs from one class to another, even if the theoretical convergence rates, i.e., the upper bounds of the discrepancy, are exactly the same. A striking example of this phenomenon is the difference between the practical convergence behaviors of the original Faure and generalized Faure sequences as shown in the preceding section.

For generalized Faure sequences, the effect of the use of lower triangular matrices on the convergence rate has been theoretically and empirically studied by several researchers. Assuming that the integrand is a fixed sufficiently smooth real function, their results imply that the expected integration error over all generalized Faure sequences becomes asymptotically

\[ O\!\left(\frac{(\log N)^{(k-1)/2}}{N^{3/2}}\right). \]

We should notice that the denominator $N^{3/2}$ is much larger than the $N$ that is well known for ordinary QMC. Unfortunately, this is an asymptotic result. We need to elaborate more on this for practical sizes of $N$ and $k$. From another viewpoint, this can be interpreted as an existence theorem of a very good generalized Faure sequence for numerical multidimensional integration. Derandomization for finding such good sequences is very interesting in both theory and practice of low-discrepancy sequences.

Abstract. In this paper we consider four different definitions for an
extension of a partially defined Boolean function in which the input con- 
tains some missing bits. We show that, for many general and reasonable 
families of function classes, three of these extensions are mathematically 
equivalent. However we also demonstrate that such an equivalence does 
not hold for all classes. 



1 Introduction 



A Boolean function, or a function in short, is a mapping $f : \mathbb{B}^n \to \mathbb{B}$, where $\mathbb{B} = \{0, 1\}$. Given a function $f$, a Boolean vector $x \in \mathbb{B}^n$ is called its true vector if $f(x) = 1$, and its false vector if $f(x) = 0$. Let us denote the set of true vectors of $f$ by $T(f)$, and let $F(f) = \mathbb{B}^n \setminus T(f)$ denote the set of its false vectors. Let us denote by $\mathcal{C}_{all}$ the family of all Boolean functions $f : \mathbb{B}^n \to \mathbb{B}$, and let us call any subfamily of $\mathcal{C}_{all}$ a class. We shall consider various classes of Boolean functions in the sequel, defined in many different ways.

A partially defined Boolean function (a pdBf in short) is defined by a pair of sets $(T,F)$ such that $T, F \subseteq \mathbb{B}^n$. A Boolean function $f$ is called an extension of the pdBf $(T,F)$ if $T \subseteq T(f)$ and $F \subseteq F(f)$ hold, that is, if such an $f$ correctly classifies all the vectors $a \in T$ and $b \in F$. Let us denote by $\mathcal{E}(T,F)$ the family of extensions of the pdBf $(T,F)$. Evidently, the disjointness of the sets $T$ and $F$ is a necessary and sufficient condition for the existence of an extension, i.e., for $\mathcal{E}(T,F) \ne \emptyset$.

It may not be evident, however, to find out whether or not a given pdBf has an extension belonging to a particular class $\mathcal{C}$ of Boolean functions. This problem has been studied in various fields such as learning theory, knowledge discovery, data mining and logical analysis of data.

In practical cases, the fact that $\mathcal{C} \cap \mathcal{E}(T,F) = \emptyset$ might be due to some classification errors in the input. To correct errors of this type, provided that they are not numerous, one can consider the optimization problem of finding the largest subsets $T^* \subseteq T$ and $F^* \subseteq F$ for which $\mathcal{E}(T^*,F^*) \cap \mathcal{C} \ne \emptyset$ holds. These problems have been studied extensively for a large variety of classes.

In this paper we shall consider another type of errors in the input, the case 
in which some data vectors are “incomplete” in the sense that some of their 
components are not available at the time of reading the input. Such missing 
information may either be due to some measurement errors at some earlier stages 
of data generation, or they are the results of data entry errors, or such lack of 
information might be due to the high cost of obtaining those. 

To model such situations, let us consider the set $\mathbb{M} = \{0, 1, *\}$, and let us interpret the asterisk components $*$ of a vector $v \in \mathbb{M}^n$ as missing bits. Then, a pdBf with missing bits (or in short a pBmb) can be defined as a pair $(T,F)$, where $T, F \subseteq \mathbb{M}^n$. Given a pBmb, it is possible to consider more than one notion of an extension $f$, depending on how the $*$'s are interpreted; in this paper, we give four different definitions, two of which have already been discussed elsewhere. We then prove for many important classes of functions that three of these definitions are equivalent. However, it is also demonstrated that this equivalence does not hold for all classes.

2 Extensions of pBmbs 

For a vector $v \in \mathbb{M}^n$, let us introduce the notations $ON(v) = \{j \mid v_j = 1,\ j = 1, 2, \ldots, n\}$ and $OFF(v) = \{j \mid v_j = 0,\ j = 1, 2, \ldots, n\}$. For a subset $A \subseteq \mathbb{M}^n$, let $S(A) = \{(v,j) \mid v \in A,\ v_j = *\}$ be the collection of all missing bits of the vectors in $A$. If $A$ is a singleton $\{v\}$, we shall also write $S(v)$ instead of $S(\{v\})$. Clearly, $\mathbb{B}^n \subseteq \mathbb{M}^n$, and $v \in \mathbb{B}^n$ holds if and only if $S(v) = \emptyset$. Let us consider a binary assignment $\alpha \in \mathbb{B}^Q$ to a subset $Q \subseteq S(A)$ of the missing bits. Then $v^\alpha$ denotes the vector obtained from $v \in A$ by replacing the $*$ components which belong to $Q$ by the binary values assigned by $\alpha$:

\[ v^\alpha_j = \begin{cases} v_j & \text{if } (v,j) \notin Q, \\ \alpha(v,j) & \text{if } (v,j) \in Q. \end{cases} \]

Let $A^\alpha$ denote the set $\{v^\alpha \mid v \in A\}$. For example, for the set $A = \{u = (1, *, 0, 1),\ v = (0, 1, *, *),\ w = (1, 1, *, 0)\} \subseteq \mathbb{M}^4$ we have $S(A) = \{(u,2), (v,3), (v,4), (w,3)\}$. If $Q = \{(u,2), (v,4)\}$, an assignment $(\alpha(u,2), \alpha(v,4)) = (1, 0) \in \mathbb{B}^Q$ yields $A^\alpha = \{u^\alpha = (1, 1, 0, 1),\ v^\alpha = (0, 1, *, 0),\ w^\alpha = (1, 1, *, 0)\}$.
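The example above can be reproduced with a small Python sketch. The representation is an assumption made for illustration only: vectors are tuples over `0`, `1` and the string `"*"`, each vector is stored under a name, positions are 1-based as in the text, and the helper names `missing_bits` and `apply_assignment` are hypothetical.

```python
STAR = "*"


def missing_bits(vectors):
    # S(A): all pairs (name, j) such that component j of the named vector is missing.
    return {(name, j)
            for name, v in vectors.items()
            for j, bit in enumerate(v, start=1)
            if bit == STAR}


def apply_assignment(vectors, alpha):
    # A^alpha: fill in exactly those missing bits that alpha assigns a value to.
    # alpha is assumed to be a dict keyed by elements of S(A).
    return {name: tuple(alpha.get((name, j), bit)
                        for j, bit in enumerate(v, start=1))
            for name, v in vectors.items()}


A = {"u": (1, STAR, 0, 1), "v": (0, 1, STAR, STAR), "w": (1, 1, STAR, 0)}
alpha = {("u", 2): 1, ("v", 4): 0}
print(sorted(missing_bits(A)))
print(apply_assignment(A, alpha))
```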

To a pBmb $(T,F)$ we shall always associate the set $S = S(T \cup F)$ of its missing bits. For a pBmb $(T,F)$ and an assignment $\alpha \in \mathbb{B}^S$, let $(T^\alpha, F^\alpha)$ be the pdBf defined by $T^\alpha = \{a^\alpha \mid a \in T\}$ and $F^\alpha = \{b^\alpha \mid b \in F\}$.

Let us call a pBmb $(T,F)$ consistent with respect to a class $\mathcal{C}$ of Boolean functions, if there exists an assignment $\alpha \in \mathbb{B}^S$ for which the pdBf $(T^\alpha, F^\alpha)$ has an extension in $\mathcal{C}$. A Boolean function $f \in \mathcal{E}(T^\alpha, F^\alpha) \cap \mathcal{C}$ will be called a consistent extension of $(T,F)$ in the class $\mathcal{C}$.



Problem CE(C) 

Input: A pBmb $(T,F)$, where $T, F \subseteq \mathbb{M}^n$.

Question: Does $(T,F)$ have a consistent extension in class $\mathcal{C}$?

Let us note that, in case $(T,F)$ has a consistent extension, the output of CE(C) might not be unique, and at an important extreme end, it may occur that for every possible interpretation of the missing bits the obtained pdBf has an extension belonging to $\mathcal{C}$. Let us call a pBmb $(T,F)$ fully consistent with the class $\mathcal{C}$ if this occurs, i.e., if $(T^\alpha, F^\alpha)$ has an extension in $\mathcal{C}$ for every $\alpha \in \mathbb{B}^S$ (the corresponding extensions may differ for different $\alpha$'s).



Problem FC(C) 

Input: A pBmb $(T,F)$, where $T, F \subseteq \mathbb{M}^n$.

Question: Is $(T,F)$ fully consistent with the class $\mathcal{C}$?

Let us remark that, unlike for problem CE(C), confirming a YES for problem FC(C) might become a computational burden because one may have to provide $2^{|S|}$ different extensions, one for each possible assignment to the missing bits of $(T,F)$, even if each extension $f \in \mathcal{E}(T^\alpha, F^\alpha) \cap \mathcal{C}$ has a small representation. For this reason, we shall consider a special case in which in fact all these extensions coincide. Let us call a Boolean function $f$ a robust extension of a given pBmb $(T,F)$ if

\[ f(a^\alpha) = 1 \text{ and } f(b^\alpha) = 0 \text{ for all } a \in T,\ b \in F \text{ and for all } \alpha \in \mathbb{B}^S. \]

The corresponding decision problem can be stated as follows.

Problem RE(C)

Input: A pBmb $(T,F)$, where $T, F \subseteq \mathbb{M}^n$.

Question: Does $(T,F)$ have a robust extension in class $\mathcal{C}$?

Let us denote by $\mathcal{E}(T,F)$ the family of all robust extensions of a given pBmb $(T,F)$.

Let us remark now that even in this special case, the computational verification of a YES may not be an easy problem. Consider, for instance, the case when the output function $f$ is represented by a DNF. Then, verifying that $f(a^\alpha) = 1$ holds for a vector $a \in T$ and for all $\alpha \in \mathbb{B}^S$ might be as difficult as the tautology problem, which is known to be co-NP-complete even if its input is restricted to 3-DNFs.

Footnote: The tautology problem is to decide if a given DNF $\varphi$ satisfies $\varphi \equiv 1$. This is the complement of the satisfiability problem, which is to decide, given a CNF $\varphi$, if there exists a vector $v$ for which $\varphi(v) = 1$.

We shall therefore consider a further special case, when such difficulties will not arise. Consider an elementary conjunction (i.e., term)

\[ t(x) = \bigwedge_{j \in P} x_j \bigwedge_{j \in N} \bar{x}_j \]

for some subsets $P, N \subseteq \{1, 2, \ldots, n\}$ with $P \cap N = \emptyset$. We shall call $t$ a robust term with respect to a pBmb $(T,F)$ and a vector $a \in T$, if $t(a^\alpha) = 1$ for all $\alpha \in \mathbb{B}^{S(a)}$ and $t(b^\beta) = 0$ for all $b \in F$ and $\beta \in \mathbb{B}^{S(b)}$. Let us note that a term $t$ is robust with respect to $(T,F)$ and $a \in T$ if and only if $P \subseteq ON(a)$ and $N \subseteq OFF(a)$ (so that no missing bit of $a$ occurs in $t$, i.e., $S(a) \cap (P \cup N) = \emptyset$), and $(ON(b) \cap N) \cup (OFF(b) \cap P) \ne \emptyset$ for all $b \in F$, conditions which are all easy to check. Let us then call a Boolean function $f$ a very robust extension of $(T,F)$, if it is a robust extension which can be represented by a disjunction of robust terms.



Problem VR(C) 

Input: A pBmb $(T,F)$, where $T, F \subseteq \mathbb{M}^n$.

Question: Does $(T,F)$ have a very robust extension in class $\mathcal{C}$?

Let us denote by $\mathcal{E}^*(T,F)$ the family of all very robust extensions of the pBmb $(T,F)$.

Problems CE(C) and RE(C) and some related optimization problems have been considered extensively for various classes. In this paper we concentrate on the relations between FC(C), RE(C) and VR(C). It is quite immediate to see from the above definitions that very robust extensions are robust as well, and that pBmbs which have robust extensions in a given class $\mathcal{C}$ are also fully consistent with that class.

Somewhat surprisingly, we can show that for many very general families of 
classes C, these three problems FC(C), RE(C) and VR(C) are equivalent. However 
such an equivalence does not hold for all classes. In fact, for certain classes C, 
problem RE(C) is polynomially solvable, while FC(C) is co-NP-complete. 

3 Classes of Boolean Functions 

We shall assume in the sequel that Boolean functions (functions, in short) are 
represented either by an explicit algebraic form, or by an oracle. In either case, 
it is possible to compute the values of such a function for given input vectors. 
In each case in the sequel, we shall make clear what is the representation of the 
considered family of functions. 

Let us call an elementary conjunction of literals a term. The most common representation we shall consider for a function $f$ will be a disjunctive normal form (or DNF in short), which is a disjunction of terms.

For two functions $f$ and $g$, we shall write $f \le g$ if $f(x) = 1$ always implies $g(x) = 1$, and $f < g$ if $f \le g$ and $f \ne g$. For a Boolean expression $A$, let us denote by $\bar{A} = 1 - A$ its negation. The components of the (unknown) vector $x = (x_1, \ldots, x_n)$ will be called Boolean variables, while variables and their complements together are called literals.

A term $t$ is called an implicant of a function $f$ if $t \le f$, and it is a prime implicant if $t$ is a maximal implicant, i.e., $t \le f$ and no term $t'$ exists such that $t < t' \le f$. It is well-known that every Boolean function can be represented by the DNF formed by the disjunction of all of its prime implicants. It is also well-known that in general, there are many other DNFs representing the same function.

We shall consider many different classes of Boolean functions, whose defini- 
tions will be either via some representation independent functional properties, 
or by properties of some of the DNF representations, or via some other repre- 
sentations. 

A large family of classes of the first type are the transitive or generalized monotone classes. Let us consider a partial order $\preceq$ of the vectors of $\mathbb{B}^n$, and let us say that a function $f$ is $\preceq$-monotone, if $f(x) \le f(y)$ holds whenever $x \preceq y$. For a given partial order $\preceq$ on $\mathbb{B}^n$, let us denote by $\mathcal{C}_{\preceq}$ the family of all $\preceq$-monotone functions. Then, we shall call a class $\mathcal{C}$ transitive, if there exists a partial order $\preceq$ on $\mathbb{B}^n$ for which $\mathcal{C} = \mathcal{C}_{\preceq}$.

Most notable examples for transitive classes are the family of positive (also called monotone) functions, $\mathcal{C}_+ = \mathcal{C}_{\ge}$, where $f \in \mathcal{C}_+$ if $f(x) \ge f(y)$ holds whenever $x \ge y$ holds (componentwise), and the family $\mathcal{C}_{reg}$ of regular Boolean functions, where $\mathcal{C}_{reg} = \mathcal{C}_{\succeq}$ for the relation defined by $x \succeq y$ if and only if $\sum_{j=1}^{k} x_j \ge \sum_{j=1}^{k} y_j$ for all $1 \le k \le n$.

Another frequently used partial order on the Boolean cube is a "tilted" monotone order. To an arbitrary vector $b \in \mathbb{B}^n$, we can associate a partial order $\ge_b$ of the Boolean cube $\mathbb{B}^n$ by defining that $v \ge_b w$ holds if and only if $v \oplus b \ge w \oplus b$ holds, where $\oplus$ denotes the exclusive-or operation (the componentwise mod 2 addition, e.g., $(1100) \oplus (0110) = (1010)$). In other words, $\ge_b$ is like the regular monotone order $\ge$ in which $b$ plays the role of the zero-vector $(0, 0, \ldots, 0)$, and $\bar{b}$ is the maximum vector. The family of $\ge_b$-monotone functions will be denoted by $\mathcal{C}_{\ge_b}$. Thus, in particular $\mathcal{C}_+ = \mathcal{C}_{\ge_0}$ holds. Let us finally remark that the family $\mathcal{C}_{all}$ of all Boolean functions itself is a transitive class, corresponding to the "empty" partial order on $\mathbb{B}^n$.
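The tilted order is easy to experiment with. The following Python sketch assumes Boolean vectors are represented as tuples of `0`/`1` integers; the function names `xor`, `geq` and `geq_b` are illustrative and do not appear in the paper.

```python
def xor(v, b):
    # Componentwise exclusive-or (mod 2 addition) of two Boolean vectors.
    return tuple(x ^ y for x, y in zip(v, b))


def geq(v, w):
    # The regular componentwise order: v >= w.
    return all(x >= y for x, y in zip(v, w))


def geq_b(v, w, b):
    # The tilted order: v >=_b w  iff  (v xor b) >= (w xor b).
    return geq(xor(v, b), xor(w, b))


print(xor((1, 1, 0, 0), (0, 1, 1, 0)))
print(geq_b((0, 0), (1, 1), (1, 1)))
```

For `b = (1, 1)` the vector `(1, 1)` plays the role of the zero vector and its complement `(0, 0)` is the maximum, which is why the second call reports that `(0, 0)` dominates `(1, 1)` in this order.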

Some other non-transitive classes, defined via a representation independent property, can be obtained by taking the union of various transitive classes. For instance, a function $f$ is called unate if it is $\ge_b$-monotone for some vector $b \in \mathbb{B}^n$. The family of unate functions hence is the union of all the $\ge_b$-monotone classes, $\mathcal{C}_{unate} = \bigcup_{b \in \mathbb{B}^n} \mathcal{C}_{\ge_b}$.

Other examples for classes defined via a representation independent property include the family of self-dual functions $\mathcal{C}_{SD}$, consisting of functions $f$ for which $f = f^d$, where the dual $f^d$ of a Boolean function $f$ is defined by $f^d(x_1, \ldots, x_n) = \bar{f}(\bar{x}_1, \ldots, \bar{x}_n)$. Similarly, the family of dual-minor functions $\mathcal{C}_{D\text{-}minor}$ consists of the functions satisfying the inequality $f \le f^d$, while the class of dual-major functions $\mathcal{C}_{D\text{-}major}$ is formed by the functions satisfying $f \ge f^d$.

Another large family of classes, the so-called DNF-classes, are defined via their DNF representation. Let us consider a family of terms $\mathcal{T}$, and let us define the corresponding DNF-class $\mathcal{C}_{\mathcal{T}}$ by $\mathcal{C}_{\mathcal{T}} = \{f(x) = \bigvee_{t \in \mathcal{S}} t(x) \mid \mathcal{S} \subseteq \mathcal{T}\}$ as the collection of all Boolean functions formed by the disjunction of a subset of terms from $\mathcal{T}$. For example, if $\mathcal{T}$ consists of all terms of degree at most $k$ (elementary conjunctions involving at most $k$ literals), then the corresponding DNF-class is the family of the so-called $k$-DNFs, $\mathcal{C}_{k\text{-}DNF}$. The special cases of linear functions ($\mathcal{C}_{1\text{-}DNF}$) and quadratic functions ($\mathcal{C}_{2\text{-}DNF}$) should, in particular, be mentioned. Another notable example for a DNF-class is the family of Horn functions, $\mathcal{C}_{Horn}$. A Boolean function $f$ is called Horn, if it can be represented by a DNF in which every term involves at most one negative variable. In other words, if $\mathcal{T}$ is the family of terms involving at most one negative variable, then Horn functions form the corresponding DNF-class, $\mathcal{C}_{\mathcal{T}} = \mathcal{C}_{Horn}$.

Let us remark that DNF-classes Ct for which T is closed under consensus (for 
definition see Section Ej) will play a special role due to the property that, for such 
a DNF, all its prime implicants must also belong to T. Among consensus closed 
classes we should mention 2-DNFs, Horn functions, and >b-monotone functions. 

Given the Boolean functions $f$ and $g$, we shall call $g$ a minor of $f$, and will denote it by $g \sqsubseteq f$, if $g$ can be represented by a disjunction of some of the prime implicants of $f$. Let us then call a class $\mathcal{C}$ minor closed if $f \in \mathcal{C}$ and $g \sqsubseteq f$ imply $g \in \mathcal{C}$. Minor closed classes include, in particular, all consensus closed DNF-classes, and unions of those, such as renamable Horn functions, unate functions, q-Horn functions, etc.

One other important class of functions, the class $\mathcal{C}_{TH}$ of threshold functions, is usually defined by a different representation. A Boolean function $f$ is called threshold if there exist real numbers $w_1, \ldots, w_n$ and $w_0$ such that $f(x) = 1$ if and only if the inequality $\sum_{j=1}^{n} w_j x_j \ge w_0$ holds. In other words, $f$ is threshold exactly when the sets $T(f)$ and $F(f)$, viewed as point sets in the Euclidean space $\mathbb{R}^n$, are linearly separable. Of course, threshold functions could also be represented by DNFs (or CNFs), but for most threshold functions such a representation would be much less efficient computationally.

4 Equivalencies between RE(C) and FC(C) 

In this section we shall show a series of results claiming, somewhat surprisingly, 
the equivalence of problems RE(C) and FC(C), under some widely applicable 
conditions. Let us remark here that two decision problems are equivalent if they 
have the same output (YES or NO) for all possible input. Equivalent decision 
problems are of course also equivalent computationally. 

Let us also note that due to the space limitations we could not include all 
the proofs here, and we refer the reader to m for the missing details. 

Let us first consider those classes $\mathcal{C}$ of Boolean functions which are closed under conjunction and disjunction; i.e., $f \wedge g \in \mathcal{C}$ and $f \vee g \in \mathcal{C}$ for all $f, g \in \mathcal{C}$.

Theorem 1. Let us assume that the class $\mathcal{C}$ of Boolean functions is closed under conjunction and disjunction. Then problems RE(C) and FC(C) are equivalent.

Proof. Since the existence of a robust extension always implies full consistency, we only show the opposite implication. Assuming that the pBmb $(T,F)$ is fully consistent with $\mathcal{C}$, we show that $(T,F)$ also has a robust extension.

By the assumption, for every pair $\alpha \in \mathbb{B}^{S(T)}$ and $\beta \in \mathbb{B}^{S(F)}$ there exists an extension $f_{\alpha,\beta} \in \mathcal{C} \cap \mathcal{E}(T^\alpha, F^\beta)$. Let us then consider the Boolean function $f$ defined by

\[ f = \bigvee_{\alpha \in \mathbb{B}^{S(T)}} \Bigl( \bigwedge_{\beta \in \mathbb{B}^{S(F)}} f_{\alpha,\beta} \Bigr). \]

We claim that $f$ is a robust extension of $(T,F)$ in the class $\mathcal{C}$. First, $f \in \mathcal{C}$ follows from our assumption that $\mathcal{C}$ is closed under conjunction and disjunction.

To see that $f$ is a robust extension of $(T,F)$, let us first consider a vector $a \in T$ and an arbitrary assignment $\alpha^* \in \mathbb{B}^{S(T)}$. Since $a^{\alpha^*} \in T^{\alpha^*}$, we have $\bigwedge_{\beta \in \mathbb{B}^{S(F)}} f_{\alpha^*,\beta}(a^{\alpha^*}) = 1$ by the fact that $f_{\alpha^*,\beta} \in \mathcal{E}(T^{\alpha^*}, F^\beta)$ for all $\beta \in \mathbb{B}^{S(F)}$. Thus $f(a^{\alpha^*}) = 1$ is implied, for all $\alpha^* \in \mathbb{B}^{S(T)}$.

Analogously, for a vector $b \in F$ and an assignment $\beta^* \in \mathbb{B}^{S(F)}$, we can observe first that $f_{\alpha,\beta^*}(b^{\beta^*}) = 0$ holds for all $\alpha \in \mathbb{B}^{S(T)}$, implied again by $f_{\alpha,\beta^*} \in \mathcal{E}(T^\alpha, F^{\beta^*})$. Thus, in this case $\bigwedge_{\beta \in \mathbb{B}^{S(F)}} f_{\alpha,\beta}(b^{\beta^*}) = 0$ follows for all $\alpha \in \mathbb{B}^{S(T)}$, implying hence $f(b^{\beta^*}) = 0$ for all $\beta^* \in \mathbb{B}^{S(F)}$.

These two observations then show that $f$ is indeed a robust extension of $(T,F)$. □

It is easy to see that transitive classes are closed under conjunction and disjunction (and even more, a class is transitive if and only if it is closed under conjunction and disjunction, see [2]) and hence the following corollary is implied by the above theorem:

Corollary 1. Problems RE(C) and FC(C) are equivalent for all transitive classes, including $\mathcal{C}_{all}$, $\mathcal{C}_+$, $\mathcal{C}_{regular}$ and $\mathcal{C}_{\ge_b}$ for all $b \in \mathbb{B}^n$.

Let us consider next certain lattice like transitive relations. We shall say that a partial order $\preceq$ on $\mathbb{B}^n$ is cube-lattice like if there is a unique $\preceq$-maximum and a unique $\preceq$-minimum in any subcube of $\mathbb{B}^n$, or equivalently, if for every term $t$, there are unique vectors $u, v \in T(t)$ such that $u \preceq w \preceq v$ holds for all $w \in T(t)$. Let us note that all partial orders mentioned in the previous section (e.g., $\ge_b$ for $b \in \mathbb{B}^n$, etc.) are cube-lattice like, and there are many others. For instance, an arbitrary permutation of the $2^n$ vertices of $\mathbb{B}^n$, viewed as a linear order, is cube-lattice like.

For vectors $v, w \in \mathbb{M}^n$, we write $v \approx w$ if there is an assignment $\alpha \in \mathbb{B}^{S(\{v,w\})}$ such that $v^\alpha = w^\alpha$, and we say that $v$ is potentially identical with $w$. For example, if $v = (1, 0, *, 1, *)$ and $w = (1, *, 0, 1, *)$ then $v \approx w$ holds.
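Potential identity can be tested componentwise: two vectors are potentially identical exactly when every position either agrees or has a missing bit in at least one of the two vectors. A minimal Python sketch, assuming vectors are tuples over `0`, `1` and `"*"` and using the hypothetical name `potentially_identical`:

```python
def potentially_identical(v, w):
    # v ~ w: some assignment of the missing bits makes the two vectors equal.
    return len(v) == len(w) and all(
        x == y or "*" in (x, y) for x, y in zip(v, w)
    )


v = (1, 0, "*", 1, "*")
w = (1, "*", 0, 1, "*")
print(potentially_identical(v, w))
```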

To every vector $a \in \mathbb{M}^n$ we can associate the subcube $\mathbb{B}(a) = \{u \in \mathbb{B}^n \mid u \approx a\} = \{a^\alpha \mid \alpha \in \mathbb{B}^{S(a)}\}$ of $\mathbb{B}^n$ consisting of all Boolean vectors one can obtain from $a$ by assigning binary values to its missing bits. Given a vector $a \in \mathbb{M}^n$ and a cube-lattice like partial order $\preceq$ on $\mathbb{B}^n$, let us denote by $a^+ \in \mathbb{B}(a)$ (resp., $a^- \in \mathbb{B}(a)$) the unique $\preceq$-maximal vector (resp., the unique $\preceq$-minimal vector) in the subcube $\mathbb{B}(a)$. Furthermore, for any subset $S \subseteq \mathbb{M}^n$ let $S^+ = \{a^+ \mid a \in S\}$ and $S^- = \{a^- \mid a \in S\}$ denote the corresponding subsets of Boolean vectors. With these notations, we can state the following generalization of Lemma 1 of an earlier paper:

Lemma 1. If $\mathcal{C}$ is an arbitrary subfamily of a transitive class $\mathcal{C}_{\preceq}$ with a cube-lattice like partial order $\preceq$, then a pBmb $(T,F)$ has a robust extension in $\mathcal{C}$ if and only if the pdBf $(T^-, F^+)$ has an extension in $\mathcal{C}$.

Proof. Since $T^- = T^{\alpha^*}$ for some $\alpha^* \in \mathbb{B}^{S(T)}$ and $F^+ = F^{\beta^*}$ for some $\beta^* \in \mathbb{B}^{S(F)}$, it follows that any robust extension of $(T,F)$ will be an extension of $(T^-, F^+)$, by the definition of a robust extension.

To see the reverse direction, let us assume that $f \in \mathcal{E}(T^-, F^+) \cap \mathcal{C}$. Since all functions in $\mathcal{C}$ are $\preceq$-monotone, and since $a^\alpha \succeq a^-$ holds, we have $f(a^\alpha) \ge f(a^-) = 1$ implied for all $a \in T$ and $\alpha \in \mathbb{B}^{S(a)}$. Similarly, $b^\beta \preceq b^+$ and $f(b^+) = 0$ imply $f(b^\beta) \le f(b^+) = 0$ for all $b \in F$ and $\beta \in \mathbb{B}^{S(b)}$. Hence, this function $f$ is also a robust extension of $(T,F)$ in $\mathcal{C}$. □
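For the class of positive functions the lemma turns into a very simple test, sketched below in Python. Two things are assumptions beyond the text: for the componentwise order, $a^-$ is obtained by setting every missing bit of $a$ to 0 and $b^+$ by setting every missing bit of $b$ to 1, and the standard fact is used that a pdBf $(T', F')$ has a positive extension if and only if no $a \in T'$ and $b \in F'$ satisfy $a \le b$. The function names are hypothetical.

```python
def lower(a):
    # a^- for the componentwise order: every missing bit set to 0.
    return tuple(0 if x == "*" else x for x in a)


def upper(b):
    # b^+ for the componentwise order: every missing bit set to 1.
    return tuple(1 if x == "*" else x for x in b)


def has_robust_positive_extension(T, F):
    # By Lemma 1, check the pdBf (T^-, F^+): a positive extension exists
    # iff no a^- lies componentwise below some b^+.
    return not any(
        all(x <= y for x, y in zip(lower(a), upper(b)))
        for a in T for b in F
    )


print(has_robust_positive_extension([(1, "*", 0)], [(0, 1, "*")]))
print(has_robust_positive_extension([("*", 1, 0)], [(0, 1, "*")]))
```

In the second call both vectors can be completed to `(0, 1, 0)`, so no function can classify every completion correctly, and the test reports that no robust positive extension exists.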

This lemma immediately implies the following statement. 

Theorem 2. If the class $\mathcal{C}$ is a subfamily of a transitive class $\mathcal{C}_{\preceq}$ with a cube-lattice like partial order $\preceq$, then problems RE(C) and FC(C) are equivalent.

Proof. Indeed, if the pBmb $(T,F)$ is fully consistent with the class $\mathcal{C}$, then the pdBf $(T^-, F^+)$ has an extension $f \in \mathcal{C} \cap \mathcal{E}(T^-, F^+)$. This $f$ will then be a robust extension of $(T,F)$ in $\mathcal{C}$ by Lemma 1. The converse direction is obvious by the definitions. □

Corollary 2. For any subfamily $\mathcal{C}$ of $\mathcal{C}_+$, $\mathcal{C}_{regular}$ and $\mathcal{C}_{\ge_b}$ for any $b \in \mathbb{B}^n$, problems RE(C) and FC(C) are equivalent.

Let us next consider DNF-classes. 

Theorem 3. Problems RE(C) and FC(C) are equivalent for all DNF-classes $\mathcal{C} = \mathcal{C}_{\mathcal{T}}$.

Proof. Let us only show that full consistency implies the existence of a robust extension, since the converse direction is immediate from the definition.

Observe first that, given a true vector $a \in T$ and an assignment $\alpha \in \mathbb{B}^{S(T)}$, each false vector $b \in F$ has a unique assignment $\beta = \beta(b) \in \mathbb{B}^{S(b)}$ minimizing the Hamming distance between the Boolean vectors $a^\alpha$ and $b^\beta$.

Let us fix an arbitrary vector $a \in T$ and an assignment $\alpha \in \mathbb{B}^{S(T)}$, and define $\beta^* \in \mathbb{B}^{S(F)}$ as the unique assignment which coincides with $\beta(b) \in \mathbb{B}^{S(b)}$ for all $b \in F$. Such an assignment obviously can be constructed by concatenating the $\beta(b)$ assignments for $b \in F$, since the sets $S(b)$ for $b \in F$ are pairwise disjoint.

Since $(T,F)$ is fully consistent with $\mathcal{C}_{\mathcal{T}}$ by our assumption, there exists a Boolean function $g \in \mathcal{C}_{\mathcal{T}} \cap \mathcal{E}(T^\alpha, F^{\beta^*})$. Since $a^\alpha$ is a true vector of such an extension, $g$ must have a term $t_{a,\alpha} \in \mathcal{T}$ for which $t_{a,\alpha}(a^\alpha) = 1$.

We claim that $t_{a,\alpha}(b^\beta) = 0$ holds for all $b \in F$ and $\beta \in \mathbb{B}^{S(b)}$. To see this, let us observe that for every vector $b \in F$ there must be a literal in $t_{a,\alpha}$ at which $b^{\beta^*}$ and $a^\alpha$ are different, since otherwise $t_{a,\alpha}(b^{\beta^*}) = t_{a,\alpha}(a^\alpha) = 1$ would follow, contradicting the fact that $g$ is an extension of the pdBf $(T^\alpha, F^{\beta^*})$. Then this literal does not correspond to any component of $S(b)$, otherwise we could switch its value in $\beta^*$ to decrease the Hamming distance to $a^\alpha$. Thus, this literal does not agree with any $b^\beta$ for $\beta \in \mathbb{B}^{S(b)}$, and hence the claim follows.

Therefore, the Boolean function defined by

\[ f = \bigvee_{a \in T,\ \alpha \in \mathbb{B}^{S(T)}} t_{a,\alpha} \]

is a robust extension of $(T,F)$ in $\mathcal{C}_{\mathcal{T}}$. Indeed, the equations $f(b^\beta) = 0$ hold for all $b \in F$ and $\beta \in \mathbb{B}^{S(b)}$ according to the above claim. Furthermore, for a true vector $a \in T$ and an arbitrary assignment $\alpha \in \mathbb{B}^{S(T)}$, we have $f(a^\alpha) = 1$ implied by $t_{a,\alpha}(a^\alpha) = 1$. □

Corollary 3. Problems RE(C) and FC(C) are equivalent for $\mathcal{C}_{all}$, $\mathcal{C}_+$, $\mathcal{C}_{k\text{-}DNF}$, $\mathcal{C}_{Horn}$.



Let US finally consider self-dual, dual-minor and dual-major functions. 

Theorem 4. Problems RE(C) and FC(C) are equivalent for C = Csd {resp., 
Co-minor and C D -maj or) , if ^ T U F {resp., ^ f and 

,*) ^ F). 

Corollaries 1, 2, 3 and Theorem 4, together with the complexity results for RE(C) in [5], imply the following theorem.

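Theorem 5. Problem FC(C) is polynomially decidable for $\mathcal{C} = \mathcal{C}_{all}$, $\mathcal{C}^+$, $\mathcal{C}_{regular}$, $\mathcal{C}^+_{k\text{-DNF}}$ (for a constant $k$), $\mathcal{C}_{k\text{-DNF}}$ (for $k = 1, 2$), $\mathcal{C}_{Horn}$, [...] $\mathcal{C}_{D\text{-major}}$, where $\mathcal{C}^+_X$ denotes the class of positive functions in $\mathcal{C}_X$, while it is co-NP-complete for $\mathcal{C} = \mathcal{C}_{k\text{-DNF}}$ (for a constant $k \ge 3$).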

5 Very Robust Extensions 

Very robust extensions play a computationally important role, since when they exist, they can usually be constructed efficiently. For instance, we shall show below that for most DNF-classes $\mathcal{C}$, if RE(C) can be solved in polynomial time, then a very robust extension can also be provided at the same time.

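Let us recall that a term

$$t(x_1, \ldots, x_n) = \bigwedge_{j \in P} x_j \bigwedge_{j \in N} \bar{x}_j$$

is called a robust term with respect to $\tilde a \in \tilde T$ for a pBmb $(\tilde T, \tilde F)$, if $t(\tilde a^\alpha) = 1$ for all $\alpha \in \mathbb{B}^{S(\tilde a)}$ and $t(\tilde b^\beta) = 0$ for all $\tilde b \in \tilde F$ and $\beta \in \mathbb{B}^{S(\tilde b)}$. In other words, if and only if

$$P \subseteq ON(\tilde a) \text{ and } N \subseteq OFF(\tilde a), \text{ and} \qquad (1)$$

$$P \cap OFF(\tilde b) \neq \emptyset \text{ or } N \cap ON(\tilde b) \neq \emptyset \text{ for every vector } \tilde b \in \tilde F. \qquad (2)$$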

Since both of these conditions are independent of the assignments to missing bits 
of (T, F), checking these conditions is quite straightforward. Therefore, verifying 
that a given DNF is a very robust extension of a pBmb (T, F) can be done in 
linear time in the size of (T, F). It is also clear from the definition that in a very 
robust extension one never needs more than |T| terms. 
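The robust-term conditions, P ⊆ ON(a) and N ⊆ OFF(a), and P ∩ OFF(b) ≠ ∅ or N ∩ ON(b) ≠ ∅ for every false vector b, can be checked directly. The following is a minimal sketch, not taken from the paper: vectors are written as strings over `0`, `1`, `*`, and the names `on_set`, `off_set` and `is_robust_term` are illustrative. The sample data is the 3-DNF example given later in Section 6.

```python
from typing import Iterable, Sequence, Set


def on_set(v: Sequence[str]) -> Set[int]:
    """ON(v): 1-based positions whose bit is fixed to 1."""
    return {j for j, bit in enumerate(v, start=1) if bit == "1"}


def off_set(v: Sequence[str]) -> Set[int]:
    """OFF(v): 1-based positions whose bit is fixed to 0."""
    return {j for j, bit in enumerate(v, start=1) if bit == "0"}


def is_robust_term(P: Set[int], N: Set[int], a: Sequence[str],
                   false_vectors: Iterable[Sequence[str]]) -> bool:
    """Check the two conditions for the term with positive literals P
    and negated literals N, with respect to the true vector a."""
    # First condition: the term only uses bits that are fixed in a.
    cond1 = P <= on_set(a) and N <= off_set(a)
    # Second condition: every false vector disagrees with the term
    # on at least one fixed bit.
    cond2 = all(bool(P & off_set(b)) or bool(N & on_set(b))
                for b in false_vectors)
    return cond1 and cond2


# Data of the 3-DNF example from Section 6.
a = "11*11"
F = ["11010", "11001", "10111", "01111"]
print(is_robust_term({1, 2, 4, 5}, set(), a, F))
print(is_robust_term({1, 2, 4}, set(), a, F))
```

Since the second condition only becomes easier to satisfy as P and N grow, the term with P = ON(a) and N = OFF(a) is robust whenever any robust term exists; restricting the shape of the term (for example to at most k literals) is what turns the search into a covering question.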

Furthermore, looking at conditions (1) and (2), it is easy to see that finding a robust term for a given pBmb $(\tilde T, \tilde F)$ and vector $\tilde a \in \tilde T$ reduces to a feasibility question in an associated set covering problem, and hence it is computationally tractable in most cases. The above immediately implies, for instance, the following statement.

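Theorem 6. Problem VR(C) can be solved in polynomial time for $\mathcal{C} = \mathcal{C}_{all}$, for $\mathcal{C} = \mathcal{C}_{\ge b}$ with $b \in \mathbb{B}^n$ (thus in particular for $\mathcal{C} = \mathcal{C}^+$), for $\mathcal{C} = \mathcal{C}_{Horn}$ (and for all related classes, such as k-quasi Horn and k-quasi reverse Horn for any fixed k), and for $\mathcal{C} = \mathcal{C}_{k\text{-DNF}}$ with k fixed.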

Let us recall that a class $\mathcal{C}$ is minor closed, if $f \in \mathcal{C}$ and $g \preceq f$ imply $g \in \mathcal{C}$. To discuss properties of very robust extensions, we shall further recall
the consensus method and some of its properties (see e.g., mini). Given two 
terms $t = \bigwedge_{j \in P} x_j \bigwedge_{j \in N} \bar{x}_j$ and $t' = \bigwedge_{j \in P'} x_j \bigwedge_{j \in N'} \bar{x}_j$, they are in conflict at variable $x_j$ if $j \in (P \cap N') \cup (N \cap P')$ (i.e., if $x_j$ appears in one and $\bar{x}_j$ appears in the other). If $t$ and $t'$ are in conflict at exactly one of the variables, then their consensus is a term $t'' = [t, t']$ defined by

$$t'' = \bigwedge_{j \in (P \setminus N') \cup (P' \setminus N)} x_j \bigwedge_{j \in (N \setminus P') \cup (N' \setminus P)} \bar{x}_j.$$

In other words, the consensus of $t$ and $t'$ is the conjunction of all the literals appearing in these terms, except the two corresponding to the conflicting variable. It is easy to see that the inequality $t'' \le t \vee t'$ holds, and that $t''$ is maximal for this property. This implies, in particular, that if $t$ and $t'$ are implicants of the Boolean function $f$, then their consensus $t'' = [t, t']$ (when it exists) is also an implicant of $f$. The consensus method is the algorithm in which consensuses of implicants of a given DNF of $f$ are formed as long as new implicants are generated. It is well known (see e.g., [13]) that this method is complete in the sense that all prime implicants of $f$ will be obtained in this way, starting from any DNF representation of $f$. Of course, all these notions and results can straightforwardly be translated for CNF representations using De Morgan's laws. The corresponding operation between clauses is known as resolution.

Returning to robust terms, we are now ready to prove the following statement.

Lemma 2. If $f$ is a robust extension of the pBmb $(\tilde T, \tilde F)$, then for every vector $\tilde a \in \tilde T$, $f$ has a prime implicant $t_{\tilde a} \le f$ which is a robust term with respect to $\tilde a$.

Proof. Let us consider an (arbitrary) DNF representation of $f$:

$$f = \bigvee_{i=1}^{m} t_i, \qquad (3)$$

where

$$t_i = \bigwedge_{j \in P_i} x_j \bigwedge_{j \in N_i} \bar{x}_j. \qquad (4)$$

Given a vector $\tilde a \in \tilde T$, we substitute $x_j = 1$ for variables with $j \in ON(\tilde a)$, and $x_j = 0$ for variables with $j \in OFF(\tilde a)$ into (3). Let $I \subseteq \{1, 2, \ldots, m\}$ denote the set of indices of those terms of (3) which do not vanish after this substitution. For terms $t_i$ with $i \in I$, we have

$$P_i \cap OFF(\tilde a) = \emptyset \text{ and } N_i \cap ON(\tilde a) = \emptyset. \qquad (5)$$

Let us denote the resulting DNF by $f' = \bigvee_{i \in I} t'_i$, where

$$t'_i = \bigwedge_{j \in P_i \setminus ON(\tilde a)} x_j \bigwedge_{j \in N_i \setminus OFF(\tilde a)} \bar{x}_j. \qquad (6)$$

Since $1 = f(\tilde a^\alpha) = f'(\tilde a^\alpha)$ holds for all $\alpha \in \mathbb{B}^{S(\tilde a)}$, it follows that $f'$ is the constant 1 function, and thus the only prime implicant 1 of $f'$ can be obtained by a chain of consensuses, starting with the terms of $f'$. Let us note that if the terms $t'_i$ and $t'_k$ for some $i \neq k$, $i, k \in I$, have a consensus, then so do the terms $t_i$ and $t_k$. Furthermore, the variables $x_j$, $j \in ON(\tilde a)$, appear only positively, and the variables $x_j$, $j \in OFF(\tilde a)$, appear only negatively in the resulting consensus $[t_i, t_k]$. Applying this observation recursively, we can repeat the same chain of consensuses which produced 1 from $t'_i$, $i \in I$, with the corresponding terms $t_i$, $i \in I$, yielding an implicant $t'_{\tilde a}$ of $f$.

Clearly, $t'_{\tilde a}$ involves only literals $x_j$ for some $j \in ON(\tilde a)$ and $\bar{x}_j$ for some $j \in OFF(\tilde a)$, and thus satisfies condition (1). Therefore, by deleting some literals from $t'_{\tilde a}$ if needed, we can obtain a prime implicant $t_{\tilde a}$ of $f$ still satisfying (1).

Let us note finally that any (prime) implicant of $f$ must satisfy condition (2), simply because $f$ is a robust extension of $(\tilde T, \tilde F)$. □
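The consensus operation used in this proof can be sketched as follows. This is an illustration only: a term is represented as a pair (P, N) of index sets for its positive and negated literals, and the name `consensus` is an assumption, not notation from the paper. The sample input is the robust 3-DNF extension that appears in the Section 6 example; its single consensus is exactly the quartic robust term mentioned there.

```python
from typing import FrozenSet, Optional, Tuple

# A term is (P, N): indices of positive literals and of negated literals.
Term = Tuple[FrozenSet[int], FrozenSet[int]]


def consensus(t: Term, u: Term) -> Optional[Term]:
    """Return the consensus [t, u], or None when the two terms are not
    in conflict at exactly one variable."""
    P, N = t
    P2, N2 = u
    conflicts = (P & N2) | (N & P2)
    if len(conflicts) != 1:
        return None
    # All literals of both terms except the two conflicting ones.
    positive = (P - N2) | (P2 - N)
    negative = (N - P2) | (N2 - P)
    return frozenset(positive), frozenset(negative)


# x1 x2 x3  and  (not x3) x4 x5, the two terms of the 3-DNF in Section 6.
t1: Term = (frozenset({1, 2, 3}), frozenset())
t2: Term = (frozenset({4, 5}), frozenset({3}))
print(consensus(t1, t2))
```

The two terms conflict only at x3, so the consensus keeps x1, x2, x4 and x5 and drops both x3 literals.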

Theorem 7. If $\mathcal{C}$ is a minor closed class, then problems RE(C) and VR(C) are equivalent.

Proof. Since a very robust extension is also a robust extension, let us prove only the non-trivial direction of the stated equivalence.

Let us assume that $(\tilde T, \tilde F)$ is a pBmb which has a robust extension $f \in \mathcal{E}(\tilde T, \tilde F) \cap \mathcal{C}$. According to Lemma 2, for every $\tilde a \in \tilde T$, $f$ has a prime implicant $t_{\tilde a} \le f$ which is a robust term of $(\tilde T, \tilde F)$. We claim that the Boolean function

$$g = \bigvee_{\tilde a \in \tilde T} t_{\tilde a} \qquad (7)$$

is a very robust extension of $(\tilde T, \tilde F)$ in the class $\mathcal{C}$.

Clearly, $g \le f$ is a minor of $f$ by its definition, hence $g \in \mathcal{C}$ is implied by the facts that $\mathcal{C}$ is minor closed and $f \in \mathcal{C}$. Since $f \in \mathcal{E}(\tilde T, \tilde F)$, the inequality $g \le f$ also implies that $g(\tilde b^\beta) \le f(\tilde b^\beta) = 0$ for all $\tilde b \in \tilde F$ and $\beta \in \mathbb{B}^{S(\tilde b)}$. Also, since $g$ contains a robust term $t_{\tilde a}$ for every $\tilde a \in \tilde T$, it follows that $g(\tilde a^\alpha) = 1$ holds for all $\tilde a \in \tilde T$ and $\alpha \in \mathbb{B}^{S(\tilde a)}$. This implies that $g$ is a robust extension of the pBmb $(\tilde T, \tilde F)$. Finally, since $g$ contains only robust terms, it is a very robust extension of $(\tilde T, \tilde F)$. □

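Corollary 4. Problems RE(C) and VR(C) are equivalent for $\mathcal{C} = \mathcal{C}_{all}$, $\mathcal{C}_{\ge b}$ with $b \in \mathbb{B}^n$ (thus in particular $\mathcal{C}^+$), $\mathcal{C}_{Horn}$, $\mathcal{C}_{r\text{-Horn}}$, $\mathcal{C}_{unate}$, $\mathcal{C}_{q\text{-Horn}}$ and $\mathcal{C}_{D\text{-minor}}$.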

The next corollary follows from Corollaries 3 and 4.

Corollary 5. For the classes $\mathcal{C} = \mathcal{C}_{all}$, $\mathcal{C}_{Horn}$, $\mathcal{C}^+$ and $\mathcal{C}_{2\text{-DNF}}$, problems VR(C), RE(C) and FC(C) are all equivalent.

Besides Theorem 6, we have the following complexity results from Corollary 4 and the results on RE(C) in [2].

Theorem 8. Problem VR(C) is polynomially solvable for $\mathcal{C} = \mathcal{C}_{D\text{-minor}}$, while it is NP-hard for $\mathcal{C} = \mathcal{C}_{r\text{-Horn}}$ and $\mathcal{C}_{unate}$.

6 Cases of Non-equivalence between FC(C), RE(C), and 
VR(C) 

In this section we shall show that problems FC(C), RE(C) and VR(C) are not 
always equivalent, despite the many quite general equivalences we have shown 
in the previous sections. We first give several classes C for which FC(C) and 
RE(C) are not equivalent, followed by a class for which RE(C) and VR(C) are 
not equivalent. 

First, one might think that Theorem [D could be generalized to prove the 
equivalence of RE(C) and FC(C) for classes closed under conjunction (but not 
necessarily closed under disjunction). This, however, is not the case, as the following simple example shows. Let us consider the class $\mathcal{C}^*$ consisting of functions $f$ for which $f(v)f(w) = 0$ holds for all pairs of vectors $v, w \in \mathbb{B}^n$ which are at Hamming distance 1. Clearly, this class $\mathcal{C}^*$ is closed under conjunction. Let us now consider the pBmb $(\tilde T, \tilde F)$ given by $\tilde T = \{(1, *)\}$ and $\tilde F = \emptyset$. Since the equation $f(1, 0) = f(1, 1) = 1$ must hold for any robust extension $f$ of $(\tilde T, \tilde F)$, $f$ does not belong to $\mathcal{C}^*$. Therefore, $(\tilde T, \tilde F)$ has no robust extension in $\mathcal{C}^*$. However, $f = x_1 x_2 \in \mathcal{C}^*$ is an extension of the pdBf $(\{(1, 1)\}, \emptyset)$ and $g = x_1 \bar{x}_2 \in \mathcal{C}^*$ is an extension of the pdBf $(\{(1, 0)\}, \emptyset)$. These imply that $(\tilde T, \tilde F)$ is fully consistent with $\mathcal{C}^*$.

Let us demonstrate next that, for the class of threshold functions $\mathcal{C}_{th}$, problems RE(C) and FC(C) are not equivalent. Let us recall first that a pdBf $(T, F)$ has a threshold extension if and only if there exist $n + 1$ real numbers $w_1, w_2, \ldots, w_n$ and $w_0$ such that

$$\sum_{i=1}^{n} w_i a_i \ge w_0 \text{ for all } a \in T, \text{ and } \sum_{i=1}^{n} w_i b_i < w_0 \text{ for all } b \in F. \qquad (8)$$

It is well known that this condition is also equivalent to the disjointness of their respective convex hulls,

$$conv(T) \cap conv(F) = \emptyset, \qquad (9)$$

where $conv(X)$ denotes the convex hull of the set $X$ in the n-dimensional real space. It is also easy to see by the definitions that a pBmb $(\tilde T, \tilde F)$ has a robust threshold extension if and only if

$$conv(\tilde T) \cap conv(\tilde F) = \emptyset, \qquad (10)$$

where $conv(\tilde X) = conv(\bigcup_{\alpha \in \mathbb{B}^{S(\tilde X)}} \tilde X^\alpha)$ for a subset $\tilde X \subseteq \mathbb{M}^n$.
Let us now consider the pBmb $(\tilde T, \tilde F)$ defined by

$$\tilde T = \{(*, 1, 1, 1),\ (0, 0, 0, 0)\}, \qquad \tilde F = \{(1, 1, 1, 0),\ (0, 1, 0, 1),\ (0, 0, 1, 1)\}.$$

The only missing bit of $(\tilde T, \tilde F)$ has two possible interpretations, yielding $\tilde T^1 = \{(1, 1, 1, 1), (0, 0, 0, 0)\}$ and $\tilde T^0 = \{(0, 1, 1, 1), (0, 0, 0, 0)\}$. It is easy to verify that the threshold Boolean function defined by $5x_1 - 3x_2 - 3x_3 + 2x_4 \ge 0$ is an extension of the pdBf $(\tilde T^1, \tilde F)$, and that $-5x_1 + 2x_2 + 2x_3 - 3x_4 \ge 0$ defines a threshold extension of $(\tilde T^0, \tilde F)$. Hence, the pBmb $(\tilde T, \tilde F)$ is fully consistent with $\mathcal{C}_{th}$. However, $(\tilde T, \tilde F)$ has no robust threshold extension by (10), since the fractional vector $(\frac{1}{3}, \frac{2}{3}, \frac{2}{3}, \frac{2}{3})$ belongs to the convex hulls of both $\tilde T$ and $\tilde F$.
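The numbers in this example can be checked mechanically. The sketch below is not from the paper. It assumes the true vectors are (*,1,1,1) and (0,0,0,0), as implied by the two completions listed above, and it reads the second inequality with coefficient −3 on x4; that digit is hard to read in this copy, and the check fails with −8. The helper names `complete`, `separates` and `combo` are illustrative.

```python
from fractions import Fraction

T_tilde = [("*", 1, 1, 1), (0, 0, 0, 0)]          # true vectors, one missing bit
F = [(1, 1, 1, 0), (0, 1, 0, 1), (0, 0, 1, 1)]     # false vectors, no missing bits


def complete(vectors, bit):
    """Replace every missing bit '*' by the given value."""
    return [tuple(bit if x == "*" else x for x in v) for v in vectors]


def separates(w, true_vecs, false_vecs):
    """True iff w.x >= 0 on all true vectors and w.x < 0 on all false ones."""
    def value(v):
        return sum(wi * vi for wi, vi in zip(w, v))
    return (all(value(a) >= 0 for a in true_vecs)
            and all(value(b) < 0 for b in false_vecs))


def combo(weights, vectors):
    """Convex combination of the vectors with the given weights."""
    return tuple(sum(l * v[j] for l, v in zip(weights, vectors))
                 for j in range(len(vectors[0])))


third = Fraction(1, 3)
# conv of the true vectors uses both completions of the missing bit.
all_true = [(1, 1, 1, 1), (0, 1, 1, 1), (0, 0, 0, 0)]
point_T = combo([third, third, third], all_true)
point_F = combo([third, third, third], F)

print(separates((5, -3, -3, 2), complete(T_tilde, 1), F))
print(separates((-5, 2, 2, -3), complete(T_tilde, 0), F))
print(point_T == point_F, point_T)
```

Each completion is separable on its own, yet the uniform combination of the three completed true vectors coincides with the uniform combination of the three false vectors, so no single threshold function can serve both completions.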

We can also show the non-equivalence of FC(C) and RE(C) for classes such as $\mathcal{C}_{unate}$ and $\mathcal{C}_{r\text{-Horn}}$.

In concluding this section, we demonstrate with the example $(\tilde T, \tilde F)$ defined below that RE($\mathcal{C}_{3\text{-DNF}}$) and VR($\mathcal{C}_{3\text{-DNF}}$) are not equivalent.

$$\tilde T = \{(1, 1, *, 1, 1)\}, \qquad \tilde F = \{(1, 1, 0, 1, 0),\ (1, 1, 0, 0, 1),\ (1, 0, 1, 1, 1),\ (0, 1, 1, 1, 1)\}.$$

This pBmb does not have a very robust 3-DNF extension, because the only robust term for $\tilde a = (1, 1, *, 1, 1)$ is the quartic term $t = x_1 x_2 x_4 x_5$.

On the other hand, the 3-DNF

$$\varphi = x_1 x_2 x_3 \vee \bar{x}_3 x_4 x_5$$

is a robust extension of $(\tilde T, \tilde F)$, hence $\mathcal{E}(\tilde T, \tilde F) \cap \mathcal{C}_{3\text{-DNF}} \neq \emptyset$.

Let us remark that in fact, problem VR($\mathcal{C}_{3\text{-DNF}}$) is always polynomially decidable, while RE($\mathcal{C}_{3\text{-DNF}}$) is co-NP-complete (see e.g., [5]).

7 Complexity of FC($\mathcal{C}_{th}$)

We have already seen in Section 6 that problems RE($\mathcal{C}_{th}$) and FC($\mathcal{C}_{th}$) are not equivalent. It is known that RE($\mathcal{C}_{th}$) is polynomially solvable (see e.g. [5]), and we show below that problem FC($\mathcal{C}_{th}$) is not only inequivalent, but has in fact a different complexity.

Theorem 9. Problem FC($\mathcal{C}_{th}$) is co-NP-complete, even if $|S(\tilde a)| \le 1$ holds for all $\tilde a \in \tilde T \cup \tilde F$.

Proof. First we show that FC($\mathcal{C}_{th}$) belongs to co-NP. By (9), a pBmb $(\tilde T, \tilde F)$ is not fully consistent with the class $\mathcal{C}_{th}$ if and only if there exists an assignment $\alpha \in \mathbb{B}^{S(\tilde T \cup \tilde F)}$ such that $conv(\tilde T^\alpha) \cap conv(\tilde F^\alpha) \neq \emptyset$. Therefore, FC($\mathcal{C}_{th}$) is in co-NP, since the last condition can be checked in polynomial time (for instance by linear programming).

To prove the completeness, we reduce the following NP-complete problem to 
our problem (see e.g., ini)- 

Problem Exact Cover

Input: A hypergraph $H = (V, \mathcal{H})$ such that $V = \{1, 2, \ldots, n\}$ and $\mathcal{H} = \{E_1, E_2, \ldots, E_m\}$, where $E \subseteq V$ for all $E \in \mathcal{H}$.

Question: Is there an $\mathcal{H}^* \subseteq \mathcal{H}$ which exactly covers $V$; i.e., for which $E \cap E' = \emptyset$ for all $E \neq E' \in \mathcal{H}^*$ and $\bigcup_{E \in \mathcal{H}^*} E = V$?

We may assume without loss of generality that any $\mathcal{H}^*$ which exactly covers $V$ contains $E_1$. This does not affect the NP-hardness of the problem, as can easily be seen, since we can always modify the input by including one more hyperedge, $E_1$, which is disjoint from all other hyperedges of $\mathcal{H}$.

Let $V_1 = \{n+1, n+2, \ldots, n+m\}$ and $V_2 = \{n+m+1, n+m+2, \ldots, n+2m\}$ and let $W = V \cup V_1 \cup V_2$. We shall denote by $(R; S)$ the vector $v \in \mathbb{M}^W$ for which $ON(v) = R$ and $S(v) = \{(v, j) \mid j \in S\}$. (Then $OFF(v) = W \setminus (R \cup S)$; thus in particular, $v = (R; \emptyset)$ denotes a binary vector.) Let us define a pBmb $(\tilde T, \tilde F)$ by the following $\tilde T, \tilde F \subseteq \mathbb{M}^W$:

$$\tilde T = \{a^{(1)} = (V \cup \{n+1\} \cup \{n+m+1\};\ \emptyset)\} \cup \{a^{(i)} = (\{n+m+i\};\ \{n+i\}),\ i = 2, 3, \ldots, m\}$$

$$\tilde F = \{b^{(0)} = (\emptyset; \emptyset)\} \cup \{b^{(1)} = (E_1 \cup \{n+1\} \cup V_2;\ \emptyset)\} \cup \{b^{(i)} = (E_i \cup \{n+i\};\ \emptyset),\ i = 2, 3, \ldots, m\}.$$

For this pBmb we have $|S(\tilde a)| \le 1$ for all $\tilde a \in \tilde T$ and $S(\tilde F) = \emptyset$. Thus, we write simply $F$ instead of $\tilde F^\alpha$ in the sequel.

We claim that this $(\tilde T, F)$ is not fully consistent with $\mathcal{C}_{th}$ if and only if an exact cover $\mathcal{H}^* \subseteq \mathcal{H}$ exists. This will then imply the theorem.

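First, we show the “only-if” part of the above claim. Let us assume that for an assignment $\alpha \in \mathbb{B}^{S(\tilde T)}$ the pdBf $(\tilde T^\alpha, F)$ has no threshold extension. It follows from (9) that there exist nonnegative real numbers $\theta_i$ ($i = 1, 2, \ldots, m$) and $\eta_i$ ($i = 0, 1, \ldots, m$) such that

$$\sum_{i=1}^{m} \theta_i = 1, \quad \sum_{i=0}^{m} \eta_i = 1, \quad \text{and} \quad \sum_{i=1}^{m} \theta_i (a^{(i)})^\alpha = \sum_{i=0}^{m} \eta_i b^{(i)}. \qquad (11)$$

By comparing the corresponding components on the two sides of the last equality of (11), we have

$$\theta_i = \frac{1}{m} \text{ for } i = 1, 2, \ldots, m, \qquad \eta_i = \begin{cases} \frac{1}{m} & \text{if } \alpha(a^{(i)}, n+i) = 1, \\ 0 & \text{if } \alpha(a^{(i)}, n+i) = 0, \end{cases} \text{ for } i = 2, 3, \ldots, m. \qquad (12)$$

Moreover, $\eta_0 = 1 - \sum_{i=1}^{m} \eta_i \ge 0$ follows.

Let us define now a family $\mathcal{H}^* \subseteq \mathcal{H}$ by

$$\mathcal{H}^* = \{E_i \mid \eta_i = \tfrac{1}{m},\ i = 1, 2, \ldots, m\}. \qquad (13)$$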



Although the proof is omitted (see fO]), we can prove that FI* is an exact cover 
of H, which completes the “only-if” part of our claim. 

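For the “if” part, take an arbitrary exact cover $\mathcal{H}^* \subseteq \mathcal{H}$, and associate an assignment $\alpha \in \mathbb{B}^{S(\tilde T)}$ to it by defining

$$\alpha(a^{(i)}, n+i) = \begin{cases} 1 & \text{if } E_i \in \mathcal{H}^*, \\ 0 & \text{otherwise.} \end{cases}$$

It is then easy to see that with the nonnegative real numbers

$$\theta_i = \frac{1}{m} \text{ for } i = 1, 2, \ldots, m, \qquad \eta_i = \begin{cases} \frac{1}{m} & \text{if } E_i \in \mathcal{H}^*, \\ 0 & \text{otherwise,} \end{cases} \text{ for } i = 1, 2, \ldots, m,$$

and $\eta_0 = 1 - \sum_{i=1}^{m} \eta_i$, all equations in (11) hold. Hence $\tilde T^\alpha$ and $F$ are not linearly separable, which proves that $(\tilde T, F)$ is not fully consistent with $\mathcal{C}_{th}$. □

8 Conclusion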

In this paper, we have considered the relations between problems FC(C), RE(C) and VR(C). We showed that for many general and reasonable families of classes such as $\mathcal{C} = \mathcal{C}_{all}$, $\mathcal{C}_{Horn}$, $\mathcal{C}^+$ and $\mathcal{C}_{2\text{-DNF}}$, these three problems are equivalent. We also demonstrated that such an equivalence does not hold for all classes $\mathcal{C}$. For instance, we showed that problem RE($\mathcal{C}_{th}$) is polynomially solvable, while FC($\mathcal{C}_{th}$) is co-NP-complete.

Abstract. Using a random deal of cards to players and a computationally unlimited eavesdropper, all players wish to share a common one-bit secret key which is information-theoretically secure from the eavesdropper. This can be done by the so-called key set protocol. In this paper we give a necessary and sufficient condition for a key set protocol to be “optimal,” that is, to succeed always in sharing a one-bit secret key.



1 Introduction 



Suppose that there are $k$ ($\ge 2$) players $P_1, P_2, \ldots, P_k$ and a passive eavesdropper, Eve, whose computational power is unlimited. All players wish to share a common one-bit secret key that is information-theoretically secure from Eve. Let $C$ be a set of $d$ distinct cards which are numbered from 1 to $d$. All cards in $C$ are randomly dealt to players $P_1, P_2, \ldots, P_k$ and Eve. We call a set of cards dealt to a player or Eve a hand. Let $C_i \subseteq C$ be $P_i$'s hand, and let $C_e \subseteq C$ be Eve's hand. We denote this deal by $\mathcal{C} = (C_1, C_2, \ldots, C_k; C_e)$. Clearly $\{C_1, C_2, \ldots, C_k, C_e\}$ is a partition of set $C$. We write $c_i = |C_i|$ for each $1 \le i \le k$ and $c_e = |C_e|$, where $|A|$ denotes the cardinality of a set $A$. Note that $c_1, c_2, \ldots, c_k$ and $c_e$ are the sizes of hands held by $P_1, P_2, \ldots, P_k$ and Eve respectively, and that $d = \sum_{i=1}^{k} c_i + c_e$. We call $\gamma = (c_1, c_2, \ldots, c_k; c_e)$ the signature of deal $\mathcal{C}$. In this paper we assume that $c_1 \ge c_2 \ge \cdots \ge c_k$; if necessary, we rename the players. The set $C$ and the signature $\gamma$ are public to all the players and even to Eve, but the cards in the hand of a player or Eve are private to herself, as in the case of usual card games. This paper addresses protocols which make all the players share a common one-bit secret key information-theoretically securely using such a random deal of cards [2,3,4,5,6,10]. A reasonable situation in which such protocols are
practically required is discussed in m, and also the reason why we deal cards 
even to Eve is found there. 

We consider a graph called a key exchange graph, in which each vertex $i$ represents a player $P_i$ and each edge $(i, j)$ joining vertices $i$ and $j$ represents a pair of players $P_i$ and $P_j$ sharing a one-bit secret key $r_{ij} \in \{0, 1\}$. Refer to 0 for the graph-theoretic terminology. A connected graph having no cycle is called a tree. If the key exchange graph is a tree, then all the players can share a common one-bit secret key $r \in \{0, 1\}$ as follows: an arbitrary player chooses a one-bit secret key $r \in \{0, 1\}$, and sends it to the rest of the players along the tree; when player $P_i$ sends $r$ to player $P_j$ along an edge $(i, j)$ of the tree, $P_i$ computes the exclusive-or $r \oplus r_{ij}$ of $r$ and $r_{ij}$ and sends it to $P_j$, and $P_j$ obtains $r$ by computing $(r \oplus r_{ij}) \oplus r_{ij}$. For $k = 2$, Fischer, Paterson and Rackoff give a protocol to form a tree, i.e. a graph having exactly one edge, as the key exchange graph by using a random deal of cards [2]. Fischer and Wright extend this protocol for any $k \ge 2$, and formalize a class of protocols called “key set protocols,” a formal definition of which will be given in the succeeding section [3,6]. We say that a “key set protocol” works for a signature $\gamma$ if the protocol always forms a tree as the key exchange graph for any deal $\mathcal{C}$ having the signature $\gamma$.
Let $\Gamma_k$ be the set of all signatures of deals for $k$ players, where the total number $d$ of dealt cards is not fixed but takes any value. Furthermore, let $\Gamma$ be the set of all signatures where the number $k$ of players is taken over all values, that is,

$$\Gamma = \bigcup_{k=2}^{\infty} \Gamma_k.$$

Define sets $W$ and $L$ as follows:

$W = \{\gamma \in \Gamma \mid$ there is a key set protocol working for $\gamma\}$; and

$L = \{\gamma \in \Gamma \mid$ there is no key set protocol working for $\gamma\}$.

Thus $\{W, L\}$ is a partition of set $\Gamma$. For $k = 2$, i.e. $\gamma \in \Gamma_2$, Fischer and Wright give a simple necessary and sufficient condition for $\gamma \in W$ [3]. For $k \ge 3$, the authors give a simple necessary and sufficient condition for $\gamma \in W$ [10]. (These necessary and sufficient conditions will be described in Section 2.5.)

One wishes to design a key set protocol which works for all signatures $\gamma \in W$, that is, always forms a tree as the key exchange graph for all deals $\mathcal{C}$ having any signature $\gamma \in W$. Such a protocol is said to be optimal for the class of key set protocols [3,6]. There exists an optimal key set protocol indeed: the “SFP protocol” given by Fischer and Wright is an example of an optimal key set protocol [3,6]. However, neither an optimal key set protocol other than the SFP protocol nor a characterization of optimal key set protocols has been known so far.

In this paper, using the condition for $\gamma \in W$ in [10], we give a complete characterization of optimal key set protocols, that is, we give a necessary and sufficient condition for a key set protocol to be optimal. Using the characterization, we can design many optimal key set protocols. Thus we show that not only the SFP protocol but also many others are optimal. Using these optimal protocols, one can produce trees of various shapes as a key exchange graph; some of them would be appropriate for efficient broadcast of a secret message. For example, one can produce a tree of a small radius, as we will show later in Section

m

2 Preliminaries

In this section we explain the “key set protocol” formalized by Fischer and Wright, and present some of the known results on this protocol [3,6,10].



2.1 Key Set Protocol 

We first define some terms. A key set $K = \{x, y\}$ consists of two cards $x$ and $y$, one in $C_i$, the other in $C_j$ with $i \neq j$, say $x \in C_i$ and $y \in C_j$. We say that a key set $K = \{x, y\}$ is opaque if $1 \le i, j \le k$ and Eve cannot determine whether $x \in C_i$ or $x \in C_j$ with probability greater than 1/2. Note that both players $P_i$ and $P_j$ know that $x \in C_i$ and $y \in C_j$. If $K$ is an opaque key set, then $P_i$ and $P_j$ can share a one-bit secret key $r_{ij} \in \{0, 1\}$, using the following rule agreed on before starting a protocol: $r_{ij} = 0$ if $x > y$; $r_{ij} = 1$, otherwise. Since Eve cannot determine whether $r_{ij} = 0$ or $r_{ij} = 1$ with probability greater than 1/2, the secret key $r_{ij}$ is information-theoretically secure. We say that a card $x$ is discarded if all the players agree that $x$ has been removed from someone's hand, that is, $x \notin (\bigcup_{i=1}^{k} C_i) \cup C_e$. We say that a player $P_i$ drops out of the protocol if she no longer participates in the protocol. We denote by $V$ the set of indices $i$ of all the players $P_i$ remaining in the protocol. Note that $V = \{1, 2, \ldots, k\}$ before starting a protocol.

The “key set protocol” has four steps as follows.

1. Choose a player $P_s$, $s \in V$, as a proposer by a certain procedure.

2. The proposer $P_s$ determines in mind two cards $x$, $y$. The cards are randomly picked so that $x$ is in her hand and $y$ is not in her hand, i.e. $x \in C_s$ and $y \in (\bigcup_{i \in V - \{s\}} C_i) \cup C_e$. Then $P_s$ proposes $K = \{x, y\}$ as a key set to all the players. (The key set is proposed just as a set. Actually it is sorted in some order, for example in ascending order, so Eve learns nothing about which card belongs to $C_s$ unless Eve holds $y$.)

3. If there exists a player $P_t$ holding $y$, then $P_t$ accepts $K$. Since $K$ is an opaque key set, $P_s$ and $P_t$ can share a one-bit secret key $r_{st}$ that is information-theoretically secure from Eve. (In this case an edge $(s, t)$ is added to the key exchange graph.) Both cards $x$ and $y$ are discarded. Let $P_i$ be either $P_s$ or $P_t$ that holds the smaller hand; if $P_s$ and $P_t$ hold hands of the same size, let $P_i$ be the proposer $P_s$. $P_i$ discards all her cards and drops out of the protocol. Set $V := V - \{i\}$. Return to step 1.

4. If there exists no player holding $y$, that is, Eve holds $y$, then both cards $x$ and $y$ are discarded. Return to step 1. (In this case no new edge is added to the key exchange graph.)

These steps 1-4 are repeated until either exactly one player remains in the protocol or there are not enough cards left to complete step 2 even if two or more players remain. In the first case the key exchange graph becomes a tree. In the second case the key exchange graph does not become a connected graph and hence does not become a tree.

Considering various procedures for choosing the proposer $P_s$ in step 1, we obtain the class of key set protocols, where all the procedures are functions $\Gamma_k \to V$.

2.2 Malicious Adversary 

If a key set protocol works for a signature $\gamma$, then the key exchange graph must become a tree for any deal $\mathcal{C}$ having the signature $\gamma$. Hence, whoever has the card $y$ contained in the proposed key set $K = \{x, y\}$, the key exchange graph should become a tree. The malicious adversary determines who holds the card $y$. We use a function $A: \Gamma_k \times V \to V \cup \{e\}$ to represent a malicious adversary, where $e$ is Eve's index. The inputs to the function $A(\gamma, s)$ are the current signature $\gamma \in \Gamma_k$ and the index $s \in V$ of a proposer $P_s$ chosen by the protocol. Its output is either the index $t$ of a player $P_t$ remaining in the protocol or the index $e$ of Eve; $A(\gamma, s) = t \neq e$ means that player $P_t$ holds card $y$; and $A(\gamma, s) = e$ means that Eve holds card $y$.

From now on, we denote by $\gamma = (c_1, c_2, \ldots, c_k; c_e)$ the current signature, and denote by $\gamma'_{(s,A)} = (c'_1, c'_2, \ldots, c'_{k'}; c'_e)$ the resulting signature after executing steps 1-4 under the assumption that $P_s$ proposes a key set $K = \{x, y\}$ and $y \in C_{A(\gamma, s)}$. We sometimes write $\gamma'$ instead of $\gamma'_{(s,A)}$ if it is clear from context.

Consider a signature $\gamma = (8, 7, 6, 4, 4, 4, 3, 2, 1; 3)$ as an example. Then, as illustrated in Fig. 1(a), the size of the hand of each player or Eve can be represented by white rectangles. For example, if the malicious adversary $A$ satisfies $A(\gamma, 2) = A(\gamma, 3) = 1$, then $\gamma'_{(2,A)} = (7, 6, 4, 4, 4, 3, 2, 1; 3)$ as in Fig. 1(b), and $\gamma'_{(3,A)} = (7, 7, 4, 4, 4, 3, 2, 1; 3)$ as in Fig. 1(c). In Figs. 1(b) and (c), the shaded rectangles correspond to the discarded cards.
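The change of signature caused by one round of steps 1-4 can be written as a small function. The sketch below is an illustration, not the authors' code: a signature is modelled as a pair (hands, c_e) with hands sorted in non-increasing order, players are numbered from 1, and the name `next_signature` is hypothetical. It assumes the proposer holds at least one card and that the adversary's answer is legal (a different player with a non-empty hand, or Eve with a non-empty hand).

```python
def next_signature(gamma, s, answer):
    """Signature after P_s proposes and the adversary answers.

    gamma:  (hands, ce), hands a non-increasing tuple of hand sizes.
    s:      1-based index of the proposer.
    answer: 1-based index t of the player holding y, or 'e' for Eve.
    """
    hands, ce = gamma
    c = list(hands)
    if answer == "e":
        # Step 4: x and y are discarded, nobody drops out.
        c[s - 1] -= 1
        ce -= 1
    else:
        t = answer
        # Step 3: the smaller hand drops out, the proposer on a tie;
        # the remaining player loses the one card used in the key set.
        drop = s if c[s - 1] <= c[t - 1] else t
        stay = t if drop == s else s
        c[stay - 1] -= 1
        del c[drop - 1]
    return tuple(sorted(c, reverse=True)), ce


gamma = ((8, 7, 6, 4, 4, 4, 3, 2, 1), 3)
print(next_signature(gamma, 2, 1))
print(next_signature(gamma, 3, 1))
```

With the example signature, the two calls give the signatures quoted in the text for A(γ,2) = 1 and A(γ,3) = 1. Re-sorting the hands after each round plays the role of renaming the players.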

If an optimal key set protocol chooses a proposer $P_s$ for $\gamma \in W$, then $\gamma'_{(s,A)} \in W$ for any malicious adversary $A$; for convenience sake any signature $\gamma = (c_1; c_e)$ with $k = 1$ is assumed to be in $W$.

It follows from the definition of a key set protocol that if two players $P_i$ and $P_j$ hold hands of the same size, that is, $c_i = c_j$, then

$$\forall A\ \ \gamma'_{(i,A)} \in W \iff \forall A\ \ \gamma'_{(j,A)} \in W.$$

Hence, if there exist two or more players $P_i$ with $c_i = c_s$ (including the proposer $P_s$), then one may assume without loss of generality that $P_s$ has the largest index among all these players. We call it Assumption 1 for convenience sake. Similarly, if $A(\gamma, s) = t \neq e$ and there exist two or more players $P_i$ with $c_i = c_t$ and $i \neq s$ (including $P_t$), then one may assume without loss of generality that $P_t$ has the largest index among all these players. We call it Assumption 2 for convenience sake. Under the two assumptions above, $\gamma'_{(s,A)} = (c'_1, c'_2, \ldots, c'_{k'}; c'_e)$ satisfies $c'_1 \ge c'_2 \ge \cdots \ge c'_{k'}$ since $\gamma$ satisfies $c_1 \ge c_2 \ge \cdots \ge c_k$.

Fig. 1. The alteration of a signature.



2.3 Feasible Players 

Fischer and Wright define a “feasible” player for a proposer as follows m Let 
$k \ge 3$. If $c_e \ge 1$, $P_i$ with $c_i = 1$ were chosen as a proposer, and $A(\gamma, i) = e$, then $P_i$'s hand would become empty although she remains in the protocol, and hence the key exchange graph would not become a tree. On the other hand, if $c_e = 0$, then $A(\gamma, i) \neq e$ and hence the protocol appears to be able to choose $P_i$ with $c_i = 1$ as a proposer; however, if $A(\gamma, i) = j$ and $c_j = 1$, then $P_j$'s hand would become empty and hence the key exchange graph would not become a tree. Thus the protocol can choose $P_i$ with $c_i = 1$ as a proposer only if $c_e = 0$ and $c_j \ge 2$ for every $j$ such that $1 \le j \le k$ and $j \neq i$, that is, only if $c_e = 0$, $i = k$ and $c_{k-1} \ge 2$. Remember that $c_1 \ge c_2 \ge \cdots \ge c_k$ is assumed. Hence, we say that a player $P_i$ is feasible if the following condition (1) or (2) holds.

(1) $c_i \ge 2$.

(2) $c_e = 0$, $c_i = 1$ with $i = k$, and $c_{k-1} \ge 2$.

Thus, if the hands of all the players remaining in a protocol are not empty, i.e. $c_k \ge 1$, and the proposer $P_s$ is feasible, then the hands of all the players remaining in the protocol will not be empty at the beginning of the succeeding execution of steps 1-4, i.e. $c'_{k'} \ge 1$. Note that there will not always exist a feasible player at the beginning of the succeeding execution of steps 1-4 even if the proposer $P_s$ is feasible.

We define a mapping $f$ from $\Gamma_k$ to $\{0, 1, 2, \ldots, k\}$, as follows: $f(\gamma) = i$ if $P_i$ is the feasible player with the smallest hand (ties are broken by selecting the player having the largest index); and $f(\gamma) = 0$ if there is no feasible player. For example, if $\gamma = (4, 3, 2, 2, 1, 1; 3)$, then $f(\gamma) = 4$. If $\gamma = (4, 4, 3, 3, 1; 0)$, then $f(\gamma) = k = 5$ because $c_e = 0$, $c_k = 1$ and $c_{k-1} \ge 2$. If $\gamma = (1, 1, 1; 2)$, then $f(\gamma) = 0$ because there is no feasible player. Hereafter we often denote $f(\gamma)$ simply by $f$ and $f(\gamma')$ by $f'$.
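The feasibility test and the mapping f can be sketched directly from these definitions. The code is an illustration only; the names `is_feasible` and `smallest_feasible` and the (hands, c_e) representation of a signature are assumptions made here.

```python
def is_feasible(hands, ce, i):
    """Player P_i (1-based) is feasible if c_i >= 2, or if
    c_e = 0, c_i = 1, i = k and c_{k-1} >= 2."""
    k = len(hands)
    if hands[i - 1] >= 2:
        return True
    return (ce == 0 and hands[i - 1] == 1 and i == k
            and k >= 2 and hands[k - 2] >= 2)


def smallest_feasible(hands, ce):
    """The mapping f: index of the feasible player with the smallest
    hand (largest index on ties), or 0 if no player is feasible."""
    feasible = [i for i in range(1, len(hands) + 1)
                if is_feasible(hands, ce, i)]
    if not feasible:
        return 0
    smallest = min(hands[i - 1] for i in feasible)
    return max(i for i in feasible if hands[i - 1] == smallest)


print(smallest_feasible((4, 3, 2, 2, 1, 1), 3))
print(smallest_feasible((4, 4, 3, 3, 1), 0))
print(smallest_feasible((1, 1, 1), 2))
```

The three calls reproduce the three examples given in the text.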

The following Lemma 1 immediately holds.

Lemma 1 ([3,10]) The following (a)-(d) hold.

(a) If $\gamma \in W$, then $c_k \ge 1$ [3].

(b) If $k \ge 3$ and $\gamma \in W$, then $f \ge 1$ [3].

(c) If $c_k \ge 1$, then $c_i = 1$ for every $i$ such that $f + 1 \le i \le k$ [10].

(d) If $f \ge 1$ and $c_f = 1$, then $f = k$, $c_k = 1$, $c_{k-1} \ge 2$, $c_e = 0$, and $\gamma \in W$ [3].

2.4 SFP Protocol 

Fischer and Wright give the SFP (smallest feasible player) protocol as a key set protocol [3,6]. The SFP protocol always chooses the feasible player with the smallest hand as a proposer, that is, chooses the proposer $P_s$ as follows:

$$s = \begin{cases} f(\gamma) & \text{if } 1 \le f(\gamma) \le k; \\ 1 & \text{if } f(\gamma) = 0. \end{cases}$$

Fischer and Wright show that the SFP protocol is optimal [3,6].

Theorem 2 ([3,6]) The SFP protocol is optimal.

Not only the SFP protocol but also many other key set protocols are optimal. 
This paper provides a complete characterization of optimal key set protocols. 

2.5 Necessary and Sufficient Condition for $\gamma \in W$

For $k = 2$, the following Theorem 3 provides a necessary and sufficient condition for $\gamma \in W$ [3].

Theorem 3 ([3]) Let $k = 2$. Then $\gamma \in W$ if and only if $c_2 \ge 1$ and $c_1 + c_2 \ge c_e + 2$.

For $k = 3$, the following Theorem 4 provides a necessary and sufficient condition for $\gamma \in W$ [10].

Theorem 4 ([3,10]) Let $k = 3$. Then $\gamma \in W$ if and only if $c_3 \ge 1$ and $c_1 + c_3 \ge c_e + 3$.

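For $k \ge 4$, the following Theorem 5 provides a necessary and sufficient condition for $\gamma \in W$ [10]. Hereafter let $B = \{i \in V \mid c_i = 2\}$, and let $b = \lfloor |B|/2 \rfloor$.

Theorem 5 ([10]) Let $k \ge 4$, $c_k \ge 1$ and $f \ge 1$. Then $\gamma \in W$ if and only if

$$\sum_{i=1}^{k} \max\{c_i - h^+, 0\} \ge \tilde f, \qquad (1)$$

where

$$\bar f = f - \delta, \qquad (2)$$

$$\tilde f = \bar f - 2\varepsilon, \qquad (3)$$

$$h = c_e - c_k + k - \bar f, \qquad (4)$$

$$h^+ = h + \varepsilon, \qquad (5)$$

$$\delta = \begin{cases} 0 & \text{if } f = 1; \\ 1 & \text{if } 2 \le f \le k - 1; \\ 2 & \text{if } f = k \text{ and } c_{k-1} \ge c_k + 1; \text{ and} \\ 3 & \text{if } f = k \text{ and } c_{k-1} = c_k, \end{cases} \qquad (6)$$

and

$$\varepsilon = \begin{cases} \max\{\min\{c_2 - h, b\}, 0\} & \text{if } 5 \le f \le k - 1; \\ \max\{\min\{c_2 - h, b - 1\}, 0\} & \text{if } 5 \le f = k \text{ and } c_e \ge 1; \text{ and} \\ 0 & \text{otherwise.} \end{cases} \qquad (7)$$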

For example, one can observe that the signature $\gamma = (8, 7, 6, 4, 4, 4, 3, 2, 1; 3)$ (see Fig. 1(a)) satisfies Eq. (1) in Theorem 5 as follows. The signature $\gamma$ satisfies $k = 9$ and $f = 8$. Thus by Eq. (6) $\delta = 1$. Since $B = \{8\}$, $b = 0$ and hence by Eq. (7) $\varepsilon = 0$. Thus, by Eqs. (2) and (3) $\tilde f = \bar f = 8 - 1 = 7$, and by Eqs. (4) and (5) $h^+ = h = 3 - 1 + 9 - 7 = 4$. Therefore,

$$\sum_{i=1}^{k} \max\{c_i - h^+, 0\} = 4 + 3 + 2 = 9 \ge 7 = \tilde f,$$

and hence the signature $\gamma$ satisfies Eq. (1). (Note that $\sum_{i=1}^{k} \max\{c_i - h^+, 0\}$ is equal to the number of rectangles above the dotted line in Fig. 1(a).) Thus $\gamma \in W$.
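The arithmetic of this worked example can be replayed in code. The sketch below is hedged in one important respect: the defining equation of the right-hand side of the test (the equation numbered (3) in Theorem 5) is not legible in this copy, and the code ASSUMES it subtracts twice ε from f − δ. The worked example only confirms the case ε = 0, where both quantities equal 7. All function and key names are illustrative.

```python
def theorem5_quantities(hands, ce):
    """Quantities of Theorem 5 for a signature (hands; ce) with k >= 4,
    c_k >= 1 and at least one feasible player."""
    k = len(hands)
    if k < 4 or hands[-1] < 1:
        raise ValueError("Theorem 5 assumes k >= 4 and c_k >= 1")
    c = (None,) + tuple(hands)                  # c[i] is c_i, 1-based

    feasible = [i for i in range(1, k + 1)
                if c[i] >= 2
                or (ce == 0 and c[i] == 1 and i == k and c[k - 1] >= 2)]
    if not feasible:
        raise ValueError("Theorem 5 assumes f >= 1")
    smallest = min(c[i] for i in feasible)
    f = max(i for i in feasible if c[i] == smallest)

    if f == 1:
        delta = 0
    elif f <= k - 1:
        delta = 1
    elif c[k - 1] >= c[k] + 1:
        delta = 2
    else:
        delta = 3

    b = sum(1 for i in range(1, k + 1) if c[i] == 2) // 2
    f_bar = f - delta
    h = ce - c[k] + k - f_bar

    if 5 <= f <= k - 1:
        eps = max(min(c[2] - h, b), 0)
    elif 5 <= f == k and ce >= 1:
        eps = max(min(c[2] - h, b - 1), 0)
    else:
        eps = 0

    h_plus = h + eps
    f_tilde = f_bar - 2 * eps   # ASSUMPTION: this equation is illegible here
    lhs = sum(max(c[i] - h_plus, 0) for i in range(1, k + 1))
    return {"f": f, "delta": delta, "b": b, "eps": eps, "f_bar": f_bar,
            "h": h, "h_plus": h_plus, "f_tilde": f_tilde,
            "lhs": lhs, "in_W": lhs >= f_tilde}


q = theorem5_quantities((8, 7, 6, 4, 4, 4, 3, 2, 1), 3)
print(q)
```

For the example signature the sketch returns f = 8, δ = 1, b = 0, ε = 0, h = 4 and a left-hand side of 9 against a right-hand side of 7, matching the values derived in the text.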

Eq. m looks in appearance to be similar to the condition for a given degree 
sequence to be “graphical” P7im. 

Since $c_1 \ge c_2 \ge \cdots \ge c_k$ is assumed, Eq. (1) is equivalent to

$$\sum_{i=1}^{\bar f} \max\{c_i - h^+, 0\} \ge \tilde f \qquad (8)$$

where the summation is taken over all $i$, $1 \le i \le \bar f$, although the summation in Eq. (1) is taken over all $i$, $1 \le i \le k$ [10].

We define $\delta'$, $\varepsilon'$, $b'$, $\bar f'$, $\tilde f'$, $h'$, $h^{+\prime}$ and $B'$ for $\gamma'$ as we did for $\gamma$.

3 Main Results 

In this section we give a complete characterization of optimal key set protocols. We first define some terms in Section 3.1 and then give the characterization in Section 3.2.

3.1 Definition of Selectable Players

In this subsection we define a “selectable” player that can be chosen as a proposer by an optimal key set protocol. We will give a complete characterization of “selectable” players in the succeeding subsection. The characterization immediately provides a complete characterization of optimal key set protocols.

The SFP protocol, which always chooses the feasible player $P_f$ with the smallest hand, is optimal. However, a key set protocol which chooses an arbitrary feasible player is not necessarily optimal. We define a “selectable” player as follows.

Definition 6 We say that a player $P_i$ is selectable for $\gamma$ if $\gamma'_{(i,A)} \in W$ for any malicious adversary $A$.

When $\gamma \in W$, the proposer chosen by an optimal key set protocol is a selectable player, of course. Since the SFP protocol is optimal, $P_f$ is a selectable player if $\gamma \in W$.

Definition 6 implies that $\gamma \in W$ if and only if there exists at least one selectable player. In other words, $\gamma \in L$ if and only if there exists no selectable player.

Furthermore, a key set protocol is optimal if and only if the protocol always chooses a selectable player as a proposer whenever such a player exists. Thus, in the remainder of the paper, we characterize the set of all selectable players.

3.2 Characterization of Selectable Players 

In this subsection we give a necessary and sufficient condition for a player to be 
selectable. 

If $\gamma \in L$, then there is no selectable player. Therefore it suffices to obtain a
necessary and sufficient condition for a player to be selectable only when $\gamma \in W$.

We first characterize the selectable players for $k = 2$ as in the following
Theorem 7.

Theorem 7 Let $k = 2$ and $\gamma \in W$. Then a player $P_i$ is selectable if and only if
$c_i \ge 2$ or $c_e = 0$.

Proof. Let $k = 2$ and $\gamma \in W$. By Lemma ??(a), $c_2 \ge 1$.

We first prove the necessity. Suppose for a contradiction that $c_i = 1$ and $c_e \ge 1$
although $P_i$ is selectable. Then one may assume that $i = 2$ by Assumption 1
for convenience sake when $P_i$ is chosen as a proposer. Since $\gamma'_{(2,A)} = (c_1, 0; c_e - 1)$
for an adversary $A$ such that $A(\gamma, 2) = e$, we have $\gamma'_{(2,A)} \in L$ by Lemma ??(a).
Thus $P_2$, i.e. $P_i$, is not selectable, a contradiction.

We next prove the sufficiency. Assume that $c_i \ge 2$ or $c_e = 0$. Then it suffices
to show that $\gamma'_{(i,A)} \in W$ for any adversary $A$. There are the following two cases
to consider.

Case 1: $A(\gamma, i) \ne e$.

In this case $\gamma'$ satisfies $k' = 1$ and hence $\gamma' \in W$.

Case 2: $A(\gamma, i) = e$.

In this case $c_e \ge 1$, and hence $c_i \ge 2$ because we assumed that $c_i \ge 2$ or
$c_e = 0$. If $i = 1$ and $c_1 \ge c_2 + 1$, then $\gamma' = (c_1 - 1, c_2; c_e - 1)$; otherwise,
$\gamma' = (c_1, c_2 - 1; c_e - 1)$. Thus, in either case, $c'_1 + c'_2 = (c_1 + c_2) - 1$ and
$c'_e = c_e - 1$. On the other hand, since $\gamma \in W$, by Theorem ??, $c_1 + c_2 \ge c_e + 2$.
Therefore $c'_1 + c'_2 \ge (c_e + 2) - 1 = c_e + 1 = c'_e + 2$. Furthermore, since $c_i \ge 2$,
$c'_2 \ge 1$. Thus, by Theorem ??, $\gamma' \in W$. □
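The case analysis in this proof can be replayed mechanically. The sketch below is only an illustration: it assumes, as the proof itself uses, that for $k = 2$ a signature is in $W$ exactly when $c_2 \ge 1$ and $c_1 + c_2 \ge c_e + 2$, and that in Case 2 the proposer and Eve each lose one card. All function names are hypothetical.

```python
def in_W_k2(c1, c2, ce):
    """Membership test for W when k = 2 (assumed from the proof above)."""
    return c2 >= 1 and c1 + c2 >= ce + 2


def selectable_by_cases(c1, c2, ce, i):
    """Replay the two adversary cases of the proof for proposer P_i (i in {1, 2})."""
    if ce == 0:
        return True  # Case 2 cannot occur; Case 1 leaves a single player (k' = 1).
    hands = [c1, c2]
    hands[i - 1] -= 1                      # Case 2: the proposer discards one card
    d1, d2 = sorted(hands, reverse=True)   # restore c'_1 >= c'_2
    return in_W_k2(d1, d2, ce - 1)         # Eve also discards one card


def selectable_by_theorem7(c1, c2, ce, i):
    """The closed-form condition of Theorem 7: c_i >= 2 or c_e = 0."""
    ci = (c1, c2)[i - 1]
    return ci >= 2 or ce == 0


def check(limit=8):
    """Compare both tests on every signature in W with hand sizes up to `limit`."""
    for c1 in range(limit + 1):
        for c2 in range(c1 + 1):
            for ce in range(limit + 1):
                if not in_W_k2(c1, c2, ce):
                    continue
                for i in (1, 2):
                    if selectable_by_cases(c1, c2, ce, i) != selectable_by_theorem7(c1, c2, ce, i):
                        return False
    return True
```

Calling `check()` compares the two tests on all small signatures; agreement on a finite range is a sanity check of the argument, not a substitute for it.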

We next characterize the selectable players for $k = 3$. It has been known that,
if $c_k \ge 1$ and $c_1 + c_k \ge c_e + k$, then any key set protocol choosing an arbitrary
feasible player as a proposer works for $\gamma$ [?]; thus the following Lemma 8
immediately holds.

Lemma 8 Let $c_k \ge 1$ and $c_1 + c_k \ge c_e + k$. Then every player $P_i$ such that
$1 \le i \le f$ is selectable.

Furthermore, it is obvious that any non-feasible player is not selectable when
$k \ge 3$; thus we have the following Lemma 9.

Lemma 9 Let $k \ge 3$. If a player $P_i$ is selectable, then $1 \le i \le f$.

By using Theorem ?? and Lemmas 8 and 9, one can easily prove that the selectable
players for $k = 3$ are characterized as in the following Theorem 10.

Theorem 10 Let $k = 3$ and $\gamma \in W$. Then a player $P_i$ is selectable if and only
if $1 \le i \le f$.

Proof. Let $k = 3$ and $\gamma \in W$. Then by Theorem ??, $c_3 \ge 1$ and $c_1 + c_3 \ge
c_e + 3$. Thus Lemma 8 implies the sufficiency. Furthermore Lemma 9 implies the
necessity. □

We finally characterize the selectable players for $k \ge 4$. Before giving the
characterization, we first give some definitions.

In a key set protocol, for every $i, j \in V$ such that $i \ne j$ and $c_i = c_j$,

$$P_i \text{ is selectable} \iff P_j \text{ is selectable.}$$

Thus, if there exist two or more players holding hands of the same size, then it
suffices to determine whether an arbitrary player among such players is selectable
or not. For example, if $\gamma = (8, 7, 6, 4, 4, 4, 3, 2, 1; 3)$, then one can choose $P_6$ as a
“representative” player among the three players $P_4$, $P_5$ and $P_6$ who have hands
of size 4. As in this example, we choose the player with the largest index among
all the players holding hands of the same size as a “representative” player, and
determine whether the chosen “representative” player is selectable or not. Let
$V_r$ be the set of indices of all the “representative” players. That is,

$$V_r = \{ i \in V \mid i = \max X \text{ and } X \in V/R \}$$

where $V/R$ is the quotient set of $V$ under the equivalence relation $R = \{(i,j) \in
V \times V \mid c_i = c_j\}$. For example, $V_r = \{1, 2, 3, 6, 7, 8, 9\}$ for the above signature $\gamma$.
It suffices to give a necessary and sufficient condition for a player $P_i$, $i \in V_r$, to
be selectable. Of course, such a necessary and sufficient condition immediately
yields a complete characterization of all selectable players (whose indices are not
necessarily in $V_r$).
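As a small illustration of the definition of the representative set, the following sketch computes it from the hand sizes alone. The function name `representatives` and the 1-based indexing are choices made here for illustration; the input is assumed to be sorted in non-increasing order, as the signatures in the text are.

```python
from itertools import groupby


def representatives(hands):
    """Return V_r: for each hand size, the largest 1-based index holding it.

    `hands` is (c_1, ..., c_k), assumed sorted in non-increasing order.
    """
    indexed = list(enumerate(hands, start=1))
    # Players with equal hand sizes are adjacent because `hands` is sorted.
    return [max(i for i, _ in group)
            for _, group in groupby(indexed, key=lambda pair: pair[1])]
```

For the hands (8, 7, 6, 4, 4, 4, 3, 2, 1) of the example signature this yields the indices 1, 2, 3, 6, 7, 8, 9, matching the set given in the text.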

Let $P_{f_m}$ be the player who holds the hand of the same size as $P_f$ and has
the smallest index, that is,

$$f_m = \min\{ i \in V \mid c_i = c_f \}.$$

From now on we define

$$M = \sum_{i=1}^{k} \max\{c_i - h^+, 0\}.$$

Note that $M$ is the same as the left side of Eq. (??) in Theorem ??. We define $M'$
for $\gamma'$ as we did for $\gamma$.

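Define $\tilde\varepsilon$ by the following Eq. (9), which is obtained by replacing $c_2$ with $c_3$
in Eq. (??):

$$\tilde\varepsilon = \begin{cases}
\max\{\min\{c_3 - h, b\}, 0\} & \text{if } 5 \le f \le k - 1; \\
\max\{\min\{c_3 - h, b - 1\}, 0\} & \text{if } 5 \le f = k \text{ and } c_e \ge 1; \text{ and} \\
0 & \text{otherwise.}
\end{cases} \qquad (9)$$

Since $c_3 \le c_2$, Eqs. (??) and (9) imply

$$0 \le \tilde\varepsilon \le \varepsilon. \qquad (10)$$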

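Furthermore, define Conditions 1 and 2 as follows.

(Condition 1)

$5 \le f = k$ and $c_{k-2} = c_{k-1} = c_k + 1$.

(Condition 2)

$c_{f_m - 2} = c_{f_m - 1} = 3$, $|B|$ is an odd number, and the following (i) or (ii) holds.

(i) $6 \le f \le k - 1$ and $c_2 - h \ge b + 1$.

(ii) $6 \le f = k$, $c_e \ge 1$, $b \ge 1$ and $c_2 - h \ge b$.

Define $\lambda$ as follows:

$$\lambda = \begin{cases}
2 & \text{if Condition 1 holds;} \\
3 & \text{if Condition 2 holds; and} \\
0 & \text{otherwise.}
\end{cases} \qquad (11)$$

Finally, define $\hat\varepsilon$ as follows:

$$\hat\varepsilon = \begin{cases}
\max\{\min\{c_2 - h - 1, b - 1\}, 0\} & \text{if } f \ge 8,\ \text{??} = 1 \text{ and } \lambda = 2; \text{ and} \\
0 & \text{otherwise.}
\end{cases} \qquad (12)$$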

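We are now ready to give a complete characterization of the selectable players
for $k \ge 4$ as in the following Theorem 11.

Theorem 11 Let $k \ge 4$ and $\gamma \in W$. Then a player $P_i$ such that $i \in V_r$ is
selectable if and only if $1 \le i \le f$ and

$$\begin{cases}
c_2 - h^+ \le M - \tilde f - (\varepsilon - \tilde\varepsilon) & \text{if } i \le 2; \\
\sum_{j} \max\{c_j - (h^+ + \hat\varepsilon + 1), 0\} \ge \tilde f - \lambda - 2\hat\varepsilon & \text{if } i = f_m - 1 \ge 4 \text{ and } \lambda \ne 0; \text{ and} \\
c_i - h^+ \le M - \tilde f & \text{otherwise.}
\end{cases} \qquad (13)$$

If (i) $i \le 2$ and $\varepsilon - \tilde\varepsilon = 0$, (ii) $i = 3$, or (iii) $i \ge 4$ and $i \ne f_m - 1$ or $\lambda = 0$,
then Eq. (13) in Theorem 11 becomes

$$\begin{cases}
c_2 - h^+ \le M - \tilde f & \text{if } i \le 2; \text{ and} \\
c_i - h^+ \le M - \tilde f & \text{if } i \ge 3.
\end{cases} \qquad (14)$$

Note that most signatures satisfy $\varepsilon - \tilde\varepsilon = \lambda = 0$ and that very few signatures
satisfy $\varepsilon - \tilde\varepsilon \ge 1$ or $\lambda \ne 0$.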

Consider the signature $\gamma = (8, 7, 6, 4, 4, 4, 3, 2, 1; 3)$ as an example again (see
Fig. ??(a)). The signature $\gamma$ satisfies $\varepsilon = 0$ as mentioned in Section ??, and hence
by Eq. (10) $\varepsilon - \tilde\varepsilon = 0$. The signature $\gamma$ does not satisfy Condition 1. Furthermore,
since $f = f_m = 8$, we have $c_{f_m - 2} = 4 \ne 3$ and hence Condition 2 does not hold.
Therefore, by Eq. (11) $\lambda = 0$. In addition, since the signature $\gamma$ satisfies $f = 8$,
$M = 9$, $\tilde f = 7$ and $h^+ = 4$ as mentioned in Section ??, we have $M - \tilde f = 2$.
Therefore, Eq. (13) in Theorem 11, i.e. Eq. (14), implies that all the selectable
players are the six players $P_3$, $P_4$, $P_5$, $P_6$, $P_7$ and $P_8$. These six players are the
feasible players holding the hands whose sizes do not exceed the solid line in
Fig. ??(a).
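The example can be re-checked numerically. The sketch below covers only the simplified two-case condition, that is, signatures for which the correction terms of Theorem 11 vanish; it takes $h^+$, $f$ and $\tilde f$ as given inputs (their definitions precede this section), and the function and parameter names are hypothetical.

```python
def spare_rectangles(hands, h_plus):
    """M = sum over i of max(c_i - h_plus, 0)."""
    return sum(max(c - h_plus, 0) for c in hands)


def selectable_simple_case(hands, h_plus, f, f_tilde):
    """1-based indices passing the simplified two-case condition.

    Only meaningful when the correction terms vanish (the common case
    according to the text); h_plus, f and f_tilde are supplied by the caller.
    """
    slack = spare_rectangles(hands, h_plus) - f_tilde     # M - f~
    result = []
    for i in range(1, f + 1):                             # feasible players only
        c = hands[1] if i <= 2 else hands[i - 1]          # c_2 is used when i <= 2
        if c - h_plus <= slack:
            result.append(i)
    return result
```

With hands (8, 7, 6, 4, 4, 4, 3, 2, 1), `h_plus = 4`, `f = 8` and `f_tilde = 7`, the function returns the indices 3 to 8, in agreement with the six players named above.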

We now intuitively explain the correctness of Theorem 11. For simplicity, let
$\varepsilon - \tilde\varepsilon = \lambda = 0$, and consider a player $P_i$ such that $i \ge 2$. Theorem ?? implies
that a necessary and sufficient condition for $\gamma \in W$ is that $M \ge \tilde f$, i.e. there
are $\tilde f$ or more rectangles above the dotted line in Fig. ??(a). Thus, a signature
$\gamma \in W$ has $M - \tilde f$ “spare” rectangles. That is, even if one removes at most
$M - \tilde f$ rectangles above the dotted line, $\gamma$ still remains in $W$, but if one removed
$(M - \tilde f) + 1$ or more rectangles above the dotted line, then $\gamma$ would be in $L$.
Further, in order for a player $P_i$ to be selectable, there must exist at least $\tilde f'$
rectangles above the dotted line in the figure of $\gamma'_{(i,A)}$ (e.g. Fig. ??(b) or (c)) for
any malicious adversary $A$. For some adversary $A$, the number of the rectangles
above the dotted line decreases by $1 + (c_i - h^+)$ when the proposer is $P_i$, as one
can immediately observe from Fig. ??(b) or (c). Note that these $1 + (c_i - h^+)$
rectangles are lightly shaded in Figs. ??(b) and (c). Furthermore, since the number
of the feasible players decreases by exactly one, we have $\tilde f' = \tilde f - 1$. Hence, if
$c_i - h^+$ were greater than the number $M - \tilde f$ of the “spare” rectangles, then
$M' = M - \{1 + (c_i - h^+)\} < \tilde f - 1 = \tilde f'$ and hence $\gamma'$ would be in $L$. Therefore,
a player $P_i$ such that $c_i - h^+ > M - \tilde f$ is not selectable. On the other hand, if
$c_i - h^+ \le M - \tilde f$, then $\gamma'$ will still remain in $W$, and hence a player $P_i$ such
that $c_i - h^+ \le M - \tilde f$ is selectable. This is the intuitive reason why Theorem 11
holds.

Due to the page limitation, we cannot include a proof of Theorem 11 in this
extended abstract; see [?].

4 Conclusion 

A key set protocol is determined by giving a procedure for choosing a proposer.
In this paper, we defined a player to be selectable if the player can be chosen as
a proposer by an optimal key set protocol, and gave a complete characterization
of such selectable players in Theorem 11. Thus we succeeded in characterizing
the set of all optimal key set protocols.

Using Theorem 11 one can efficiently find all selectable players in time $O(k)$.
Let $P_j$ be the selectable player having the smallest index $j$. Then one may
intuitively expect that all players $P_i$ such that $j \le i \le f$ are selectable. However,
surprisingly, this is not the case. Theorem 11 implies that all the players $P_i$ such that
$j \le i \le f$ and $c_i \ne c_{f_m - 1}$ are selectable but $P_{f_m - 1}$ may or may not be selectable.
Consider a signature $\gamma = (5, 5, 5, 4, 4, 3, 3, 2, 1; 2)$ as an example. Then $\gamma$ satisfies
$f = 8$, $f_m - 1 = 7$, $\lambda = 3$, $h^+ = 3$, $M = 8$, $\varepsilon = \tilde\varepsilon = \hat\varepsilon = 0$ and $\tilde f = 7$. Thus
Eq. (13) in Theorem 11 becomes



Therefore, $P_7$ is not selectable, and all the selectable players are the three players
$P_4$, $P_5$ and $P_8$. As in this example, the indices of selectable players are not
necessarily consecutive numbers.
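The quoted values can be checked by hand for the players other than $P_6$ and $P_7$. The following is only a sanity check built from the numbers stated in the example and from the first and last cases of Eq. (13); it is not the display that follows "becomes" in the original, and it does not attempt the special case that governs $P_7$.

```latex
\begin{align*}
M &= \sum_{i=1}^{9} \max\{c_i - 3,\, 0\} = 2 + 2 + 2 + 1 + 1 = 8,
  \qquad M - \tilde f = 8 - 7 = 1,\\
i \le 2:&\quad c_2 - h^+ = 5 - 3 = 2 > 1 \quad\text{(condition fails)},\\
i = 3:&\quad c_3 - h^+ = 5 - 3 = 2 > 1 \quad\text{(condition fails)},\\
i \in \{4, 5\}:&\quad c_i - h^+ = 4 - 3 = 1 \le 1 \quad\text{(condition holds)},\\
i = 8:&\quad c_8 - h^+ = 2 - 3 = -1 \le 1 \quad\text{(condition holds)}.
\end{align*}
```

This agrees with the three selectable players named in the text, given that $P_4$ shares its hand size with the representative $P_5$.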

Using the characterization of selectable players, one can design many optimal
key set protocols. Assume that $c_1 = c_2 = \cdots = c_k$ and $\gamma \in W$. Then in most
cases the SFP protocol forms a spanning path of length $k$, i.e. a tree of radius
$\lfloor k/2 \rfloor$, as the key exchange graph. On the other hand, using various optimal key
set protocols, one can produce trees of various shapes as a key exchange graph,
some of which would be appropriate for efficient broadcast of a secret message.
For example, consider an optimal key set protocol which always chooses, as a
proposer, the selectable player holding the largest hand; such a protocol forms
a tree of a smaller radius than $\lfloor k/2 \rfloor$. Furthermore, we can choose the selectable
player having the largest degree as a proposer and modify step 3 of the key set
protocol in a way that either $P_s$ or $P_t$, whichever has the smaller degree, drops out
of the protocol whenever the resulting signature remains in $W$; such a protocol
forms a tree of much smaller radius, especially when $c_1 = c_2 = \cdots = c_k$ is large.
We have verified these facts by extensive computer simulation.

This paper addresses only the key set protocol, which establishes a one-bit
secret key. On the other hand, the “transformation protocol” given by Fischer
and Wright [?] establishes an $n$-bit secret key. For a signature $\gamma = (3, 2; 4) \in L$,
no key set protocol works for $\gamma$, but the transformation protocol always
establishes a one-bit secret key for $\gamma$. However, for a signature $\gamma = (4, 4, 4, 4; 4) \in
W$, any optimal key set protocol works for $\gamma$, but the transformation protocol
cannot establish a one-bit secret key for $\gamma$. Thus no protocol entirely superior to
the key set protocol is known.
Abstract. In the framework of the Blum-Shub-Smale real number model [?], we
study the algebraic complexity of the integer linear programming problem
(ILPr): Given a matrix $A \in \mathbb{R}^{m \times n}$ and vectors $b \in \mathbb{R}^m$,
$d \in \mathbb{R}^n$, decide whether there is $x \in \mathbb{Z}^n$ such that $Ax \le b$, where
$0 \le x \le d$. The main contributions of the paper are the following:

- An $O(m \log \|d\|)$ algorithm for ILPr, when the value of $n$ is fixed. As
a corollary, we obtain under the same restriction a tight algebraic complexity
bound $\Theta(\log \frac{1}{a_{\min}})$, $a_{\min} = \min\{a_1, \ldots, a_n\}$, for the knapsack
problem (KPr): Given $a \in \mathbb{R}^n$, decide whether there is $x \in \mathbb{Z}^n$ such
that $a^T x = 1$. We achieve these results in particular through a careful
analysis of the algebraic complexity of Lovász' basis reduction algorithm
and Kannan and Bachem's Hermite normal form algorithm, which
may be of independent interest.

- An $O(m n^5 \log n \, (n + \log \|d\|))$ depth algebraic decision tree for ILPr,
for every $m$ and $n$.

- A new lower bound for 0/1 KPr. More precisely, no algorithm can
solve 0/1 KPr in $o(n \log n) \, f(a_1, \ldots, a_n)$ time, even if $f$ is an arbitrary
continuous function of $n$ variables. This result appears as an alternative
to the well-known Ben-Or's bound $\Omega(n^2)$ [?] and is independent of it.

Keywords: Algebraic complexity, Complexity bounds, Integer programming,
Knapsack problem



1 Introduction 

We study the algebraic complexity of the following integer linear programming
(ILP) problem:

(ILPr) Given a matrix $A \in \mathbb{R}^{m \times n}$ and vectors $b \in \mathbb{R}^m$, $d \in \mathbb{R}^n$,
decide whether there is $x \in \mathbb{Z}^n$ such that $Ax \le b$, where $0 \le x \le d$.

The input entries are arbitrary real numbers and, accordingly, the adopted model 
of computation is a real number model. This kind of model has been traditionally 

J. van Leeuwen et al. (Eds.): IFIP TCS2000, LNCS 1872, pp. 286-^^^ 2000. 
used in scientific computing, computational geometry, and (although not explicitly)
numerical analysis (see, e.g., [18,19,22]). In our study we conform mainly
to the model presented in [3], known as the BSS-model (named after its creators
Blum, Shub and Smale). In the BSS-model, the assumption is that all the real
numbers in the input have unit size, and the basic algebraic operations $+$, $-$, $\times$, $/$
and the relation $\le$ are executable at unit cost. Thus the algebraic complexity
of a computation on a problem instance is the number of operations and branchings
performed to solve the instance. For more details on the BSS-model and
complexity theory over arbitrary rings, we refer to [?]. We notice that in this
new theory, aimed at providing a complexity framework for disciplines like those
mentioned above, an important issue is seen in the comparison of results over
the reals with classical results over the integers, which may help elucidate some
fundamental concepts, like computability and complexity.

At this point it is important to mention that the requirement for bounded
domain (i.e., $0 \le x \le d$) is essential and dictated by the very nature of the
problem, namely by the fact that the coefficients may be irrational numbers. In
such a case, a problem with unbounded domain may be, in general, undecidable,
as shown in [?].

In a classical setting, integer linear programming with integer or rational
inputs is among the best-studied combinatorial problems. A substantial body
of literature, impossible to report here, has been developed on the subject. In
particular, it is well known that ILP is NP-complete [?]. Comparatively less is
known about the complexity of ILPr in the framework of the BSS-model. Some
related results are reported in [?]. In [2] Blum et al. pose the problem
of studying the complexity of an important special case of ILPr, known as the
“real” knapsack problem:

(KPr) Given $a \in \mathbb{R}^n$, decide if there is $x \in \mathbb{Z}^n$ such that $a^T x = 1$.

With the present paper we take a step towards determining the complexity of
ILPr and KPr. Our main contributions are the following.

1. An $O(m \log \|d\|)$ algorithm for ILPr when the value of $n$ is fixed (Section 2).

A similar result is known for the integer case, namely, the well-known Lenstra's
algorithm for ILP of a fixed dimension $n$ [?]. Our algorithm consists
of two stages: a reduction of the given real input to an integer input
determining the same admissible set, followed by an application of Lenstra's
algorithm. The first stage involves simultaneous Diophantine approximation
techniques, while the second employs two well-known algorithms:
Lovász' basis reduction algorithm [?] and Kannan and Bachem's Hermite
normal form algorithm [?]. It is straightforward to obtain an upper time
complexity bound that is quadratic in $\log \|d\|$. Our more detailed analysis
reveals that the actual complexity of the latter two algorithms (and, as a
consequence, of the entire algorithm) is linear in $\log \|d\|$.

Applied to the knapsack problem KPr of fixed dimension $n$, our algorithm
has complexity $O(\log \frac{1}{a_{\min}})$, $a_{\min} = \min\{a_1, \ldots, a_n\}$, and turns out to be
optimal.

In view of the fact that Lovász' basis reduction algorithm and
Kannan and Bachem's Hermite normal form algorithm are fundamental and very
important combinatorial algorithms, we believe that their algebraic complexity
analysis within the BSS-model may be of independent interest.

2. An $O(m n^5 \log n \, (n + \log \|d\|))$ depth algebraic decision tree for ILPr, for
every $m$ and $n$ (i.e., in a model which is nonuniform with respect to them)
(Section 3).

This result is in the spirit of the well-known Meyer auf der Heide's $n^4 \log n +
O(n^3)$ depth linear decision tree for 0/1 KPr (i.e., KPr with $x \in \{0, 1\}^n$) [?].

3. A new lower bound for 0/1 KPr. More precisely, no algorithm can solve 0/1
KPr in $o(n \log n) \, f(a_1, \ldots, a_n)$ time, even if $f$ is an arbitrary continuous
function of $n$ variables (Section 4).

This result appears as an alternative to the well-known Ben-Or's bound
$\Omega(n^2)$ [?] and is independent of it, in the sense that neither of the two
results is superior to or implies the other.

2 Analysis of the Basic Algorithms 

In this section, we analyze the Lovász lattice basis reduction algorithm [?] and
Kannan and Bachem's Hermite normal form algorithm [?]. It is well known
that these are polynomial within the classical computational model. This implies
that, within the BSS model, they are polynomial with respect to the dimensions
$m$ and $n$ of the input matrices and the maximal bit-size $S$ of their integer (or
rational) entries. Our deeper analysis shows that they are linear in $S$.



2.1 Some Useful Facts 

In this section, we state some simple facts about vectors and matrices with ratio- 
nal entries of bit-size at most S. Although trivial, these facts will be instrumental 
in analyzing the algorithms in the next sections. 

1. Let $a$ be a non-zero rational number. Then $1/2^S \le |a| \le 2^S$.

2. Let bi, 62 be non-orthogonal n-dimensional rational vectors. Then 1 < 

1(61,62)1 < . 

3. Let $B$ be a non-singular $n \times n$ rational matrix. Then $1/2^{n^2 S} \le |\det(B)| \le
n! \, 2^{nS}$.

4. Let Bi he an n X i rational matrix of rank i, i < n. Then 1 j < 
|det {B'^B) \ < R!2"(i°g’"+2‘S). 
2.2 Lovász Lattice Basis Reduction Algorithm

In the description and analysis of the algorithm we follow [?]. The input consists
of linearly independent vectors $b_1, b_2, \ldots, b_n \in \mathbb{Q}^n$, considered as a basis for a
lattice $L$. The algorithm transforms them iteratively. At the end, they form a
basis for $L$ which is reduced in the Lovász sense.

First we recall some definitions, then describe the Lovász lattice basis reduction
algorithm itself. With a basis $b_1, b_2, \ldots, b_n$, we associate the orthogonal
system $b_1^*, b_2^*, \ldots, b_n^*$, where $b_i^*$ is the component of $b_i$ which is orthogonal to
$b_1, b_2, \ldots, b_{i-1}$. The vectors $b_1^*, b_2^*, \ldots, b_n^*$ can be computed by Gram-Schmidt
orthogonalization:

$$b_1^* = b_1, \qquad b_i^* = b_i - \sum_{j=1}^{i-1} \mu_{i,j} b_j^*, \quad 2 \le i \le n,$$

where $\mu_{i,j} = (b_i, b_j^*) / \|b_j^*\|^2$.

The basis $b_1, b_2, \ldots, b_n$ is size-reduced if all $|\mu_{i,j}| \le 1/2$. Given an arbitrary
basis $b_1, b_2, \ldots, b_n$, we can transform it into a size-reduced basis with the same
Gram-Schmidt orthogonal system, as follows:

For every $i$ from 2 to $n$; for every $j$ from $i - 1$ to 1:
set $b_i := b_i - \lceil \mu_{i,j} \rfloor b_j$ and update $\mu_{i,k}$ for $1 \le k \le i - 1$, by setting
$\mu_{i,k} := \mu_{i,k} - \lceil \mu_{i,j} \rfloor \mu_{j,k}$ (with $\mu_{j,j} = 1$ and $\mu_{j,k} = 0$ for $k > j$), where
$\lceil \cdot \rfloor$ denotes rounding to the nearest integer.

Now, we can describe a variant of the Lovász lattice basis reduction algorithm.

1. Initiation. Compute the Gram-Schmidt quantities $\mu_{i,j}$ and $b_i^*$ for $1 \le j <
i \le n$. Size-reduce the basis.

2. Termination condition. If $\|b_i^*\|^2 \le 2 \|b_{i+1}^*\|^2$ for $1 \le i \le n - 1$, then stop.

3. Exchange step. Choose the smallest $i$ such that $\|b_i^*\|^2 > 2 \|b_{i+1}^*\|^2$. Exchange
$b_i$ and $b_{i+1}$. Update the Gram-Schmidt quantities. Size-reduce the basis. Go
to 2.
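The three steps above can be sketched in a few lines. This is a deliberately naive illustration, not the variant analysed in the text: it recomputes the Gram-Schmidt system from scratch after every exchange instead of updating it incrementally, so it does not reflect the operation counts derived below. It uses exact rational arithmetic, and all function names are hypothetical.

```python
from fractions import Fraction


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def orthogonalize(basis):
    """Gram-Schmidt: return the vectors b_i* (exact rational arithmetic)."""
    star = []
    for b in basis:
        v = list(b)
        for s in star:
            coeff = dot(b, s) / dot(s, s)              # mu_{i,j}
            v = [x - coeff * y for x, y in zip(v, s)]
        star.append(v)
    return star


def size_reduce(basis):
    """Make every |mu_{i,j}| <= 1/2 in place; the b_i* stay unchanged."""
    star = orthogonalize(basis)
    for i in range(1, len(basis)):
        for j in range(i - 1, -1, -1):
            mu = dot(basis[i], star[j]) / dot(star[j], star[j])
            r = round(mu)                              # nearest integer
            basis[i] = [x - r * y for x, y in zip(basis[i], basis[j])]


def lovasz_reduce(vectors):
    """Return a basis of the same lattice that is reduced in the above sense."""
    basis = [[Fraction(x) for x in v] for v in vectors]
    size_reduce(basis)                                 # step 1
    while True:
        norms = [dot(s, s) for s in orthogonalize(basis)]
        bad = [i for i in range(len(basis) - 1) if norms[i] > 2 * norms[i + 1]]
        if not bad:                                    # step 2
            return basis
        i = bad[0]                                     # step 3: smallest such i
        basis[i], basis[i + 1] = basis[i + 1], basis[i]
        size_reduce(basis)
```

Recomputing the orthogonal system keeps the sketch short and easy to check; an implementation meant to match the complexity analysis would use incremental updates of the Gram-Schmidt quantities instead.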



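For completeness, we give formulae for updating the Gram-Schmidt quantities
in step 3:

$$\|b_i^*\|^2_{new} = \|b_{i+1}^*\|^2 + \mu_{i+1,i}^2 \|b_i^*\|^2, \qquad
\|b_{i+1}^*\|^2_{new} = \frac{\|b_i^*\|^2 \, \|b_{i+1}^*\|^2}{\|b_i^*\|^2_{new}}, \qquad
\mu_{i+1,i}^{new} = \mu_{i+1,i} \frac{\|b_i^*\|^2}{\|b_i^*\|^2_{new}},$$

$$\mu_{i,j}^{new} = \mu_{i+1,j}, \qquad \mu_{i+1,j}^{new} = \mu_{i,j} \qquad \text{for } 1 \le j \le i - 1,$$

$$\mu_{j,i}^{new} = \mu_{j,i} \, \mu_{i+1,i}^{new} + \mu_{j,i+1} \frac{\|b_{i+1}^*\|^2}{\|b_i^*\|^2_{new}}, \qquad
\mu_{j,i+1}^{new} = \mu_{j,i} - \mu_{j,i+1} \, \mu_{i+1,i} \qquad \text{for } i + 2 \le j \le n.$$

The other $\|b_j^*\|^2$'s and $\mu_{i,j}$'s do not change.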

After termination of the above algorithm, we have a size-reduced basis for
which $\|b_i^*\|^2 \le 2 \|b_{i+1}^*\|^2$, $1 \le i \le n - 1$. We call such a basis reduced in the
Lovász sense. (There are other definitions of this concept in the literature, but
for our purposes they are essentially equivalent.) Important properties of such a
basis are

$$\|b_1\| \le 2^{(n-1)/2} \, \|(\text{shortest vector in } L)\|,$$

and

$$\prod_{i=1}^{n} \|b_i\| \le 2^{n(n-1)/4} \det(L). \qquad (1)$$

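Let us analyze the running time of steps 2 and 3. Consider the function

$$F(b_1^*, b_2^*, \ldots, b_n^*) = \prod_{i=1}^{n-1} \det(B_i^T B_i),$$

where $B_i$ is a matrix having $b_1, b_2, \ldots, b_i$ as column vectors. No size-reduction
operation changes $F$, as it does not change the $\|b_i^*\|$'s. After an exchange step,
we obtain

$$\frac{F_{new}}{F_{old}} = \frac{\|b_i^*\|^2_{new}}{\|b_i^*\|^2}
= \frac{\|b_{i+1}^*\|^2 + \mu_{i+1,i}^2 \|b_i^*\|^2}{\|b_i^*\|^2} \le \frac{3}{4}. \qquad (2)$$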



It is not hard to see that every iteration of steps 2-3 consists of $O(n^2)$ basic
arithmetic operations (because of the size-reduction; an updating needs only
$O(n)$ such operations). The only problem might be the rounding operations $\lceil \cdot \rfloor$
performed during the size-reductions. We observe that the absolute values of
their arguments are at most $O(n \mu_{i+1,i}^{new})$. Then the time needed for one such
operation is

$$O\left(\log n + \log \mu_{i+1,i}^{new}\right) = O\left(\log n + \log \frac{\|b_i^*\|^2}{\|b_i^*\|^2_{new}}\right).$$

Thus, the time complexity of one iteration is

$$O\left(n^2 \left(\log n + \log \frac{\|b_i^*\|^2}{\|b_i^*\|^2_{new}}\right)\right)
= O\left(n^2 \left(\log n + \log \frac{F_{old}}{F_{new}}\right)\right).$$

Then the time complexity of all iterations is

$$O\left(n^2 \left((\#\text{iterations}) \log n + \log \frac{F_{start}}{F_{end}}\right)\right). \qquad (3)$$

Because of (2), the number of iterations is $O\left(\log \frac{F_{start}}{F_{end}}\right)$, so that the overall
complexity of steps 2 and 3 is

$$O\left(n^2 \log n \, \log \frac{F_{start}}{F_{end}}\right). \qquad (4)$$

What remains is to estimate the running time of step 1 and the ratio $F_{start}/F_{end}$.
By definition, $\mu_{i,j}$ ($1 \le j \le i - 1$) can be considered as a solution of the
following linear system

f (bi,bi) {bl,b^-l) \ / \ / (bi,bi) 

for 2 < z < n. From here and fact 4 from Section 2.1, it is not difficult to 
deduce that before the size-reduction phase of step 1, ||/Zi,j|| < 2'^('^” ). The 
size-reduction itself takes O (n^) |".J operations on these numbers, so that the 
time complexity of step 1 is clearly O 

Lastly, we need to estimate F start and F^nd- F start is a product of the deter- 
minants of n — 1 matrices Bj Bi where l<z<n— 1, so that 

To estimate Fend, let us observe that any of the vectors b^‘^, . . . is an 
integer linear combination of Therefore, for 1 < z < n, we 

have Bf^‘^ = Bn°'''^Ai, where is an rz x z integer matrix. This implies 

det = det {AjA,) det > 1 , 

by fact 4 from Section 2.1. Consequently, Fend > 1 . Thus we obtain 

log = O (jAS) . Hence, the overall complexity of the Lovasz basis reduction 
algorithm is O {^Sn^ log n) . 

Finally, we will prove that the bit-size of the entries of the reduced basis is 
O (S'rz^). Let us recall inequality (1): 




n ll^rl < 2^^^ det (L) = 2^^^ |det | , 

i=l 

and denote with a the least common multiple of all entries of Note that 

the bit-size of a is O (S'rz^). Since 6®”'^’s are integer linear combinations of &®‘“’'*’s, 
the vectors a&®"®* are integer. Therefore, we have 



niK”' 

i=l 



< 2 






det (5®*“’'*) I < 2^^^2^”'zz!2^ = 2°('^”'). 



Thus, every entry of aB^‘^ is of bit-size O (^Sn^) and so is every entry of H®"'^. 
In terms of the adopted notation, we have proved the following lemma.



Lemma 1. The algebraic complexity of Lovasz’ basis reduction algorithm is 
O(Sn^logn), and the bit-size of the entries in the reduced basis is O(Sn^). 
2.3 Kannan and Bachem's Hermite Normal Form Algorithm

In our description we follow [?]. The input for the algorithm is an $m \times n$ ($m \le n$)
integer matrix $A$ of full rank. The algorithm uses the matrix

$$A' = \begin{pmatrix} & M & & \\ A & & \ddots & \\ & & & M \end{pmatrix},$$

where $M$ is the absolute value of some nonsingular $m \times m$ minor of $A$. $A'$ has
the same Hermite normal form as $A$. The algorithm consists of the following five
steps:



1. Cause all the entries of the matrix $A$ to fall into the interval $[0, M)$, by adding
to the first $n$ columns of $A'$ proper integer multiples of the last $n$ columns;

2. For $k$ from 1 to $m$ do 3-4;

3. If there are $i \ne j$, $k \le i, j \le n + k$, such that $a'_{k,i} \ge a'_{k,j} > 0$, then subtract
from the $i$th column the $j$th one multiplied by $\lfloor a'_{k,i} / a'_{k,j} \rfloor$. Then reduce the $i$th
column modulo $M$. Go to 3;

4. Exchange the $k$th column and the only column with $a'_{k,i} > 0$;

5. For every $i$ from 2 to $n$; for every $j$ from 1 to $i - 1$, add an integer multiple
of the $i$th column to the $j$th one, to get $a'_{i,i} > a'_{i,j} \ge 0$.
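The effect of step 3 on the single row $k$ can be illustrated in isolation. The sketch below is an assumption-laden simplification: it tracks only the entries of one row (not whole columns), omits the reduction modulo $M$, and always subtracts multiples of the smallest positive entry, which is one admissible way of choosing the pair $i, j$. The function name is hypothetical.

```python
def eliminate_row(entries):
    """Repeat the subtraction of step 3 on one row of non-negative integers.

    While two entries are positive, another entry is replaced by its remainder
    modulo the smallest positive one, which is what subtracting
    floor(a_i / a_j) times column j does to row k. The reduction modulo M
    is deliberately left out. Returns the final row and the iteration count.
    """
    row = list(entries)
    steps = 0
    while True:
        positive = [idx for idx, value in enumerate(row) if value > 0]
        if len(positive) < 2:
            return row, steps
        j = min(positive, key=lambda idx: row[idx])      # smallest positive entry
        i = next(idx for idx in positive if idx != j)    # any other positive entry
        row[i] -= (row[i] // row[j]) * row[j]            # a_i := a_i mod a_j
        steps += 1
```

The loop stops when a single positive entry remains, which is the situation step 4 relies on; that entry is the greatest common divisor of the original row entries.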



In order to show that the time complexity is polynomial in $m$, $n$ and linear in $S$,
we need to analyze step 3. For this, we introduce the function

$$F(a'_{k,k}, \ldots, a'_{k,n+k}) = \prod_{\substack{k \le i \le n+k \\ a'_{k,i} > 0}} a'_{k,i}.$$
After one iteration of step 3, we have 








1 / 


“fc.i- 





F 



‘‘kA 



which implies both < i ^nd Fa^ ^ jg hard to see that one 

iteration of step 3 can be performed in O ^mlog = O ^mlog time. 

So, step 3 takes O (^m log Fta^t ^ Since Fstart < , F^nd > 1, and 

M = O (to! 2’”‘^) by fact 3 of Section 2.1, the overall running time of step 3 is 
O {nm (log TO -k S')). Then the complexity of the Kannan-Bachem’s algorithm is 
O {nrn? (log m + S)). 

Since all the resulting integers are smaller than M, their bit-size is O {Smn). 
Thus we have proved the following lemma. 

Lemma 2. Let A be an m x n (m < n) integer matrix of full rank. Then the 
algebraie eomplexity of the Kannan- Baehem’s algorithm that reduces A into its 
Hermite normal form, is 0{m^n(logm S)). 
3 Basic Results about ILPr

In this section we use the analysis of the algorithms from the previous section
to obtain the first two of the results announced in the Introduction.

To solve ILPr algorithmically within the BSS-model, we follow the idea of
our method developed in [?]. There its complexity was analyzed within a
strengthened version of the BSS-model, in which the floor operation $\lfloor \cdot \rfloor$ is considered
as a basic one, executable at unit cost. Here we will apply and analyze it within
the standard BSS-model.

The algorithm employs in one of its stages the well-known algorithm for
finding a simultaneous Diophantine approximation to a given rational vector. In
particular, we will use the following lemma.

Lemma 3. (see, e.g., [?, Corollary 6.4c]) There exists a polynomial algorithm
which, given a vector $a \in \mathbb{Q}^n$ and a rational number $\varepsilon$, $0 < \varepsilon < 1$, finds an
integral vector $p$ and an integer $q$ such that $\|a - (1/q)p\| < \varepsilon/q$, and $1 \le q \le
2^{n(n+1)/4} \varepsilon^{-n}$.



Now we pass to the description of our algorithm for ILPr. As mentioned
before, it consists of two main stages. In the first stage, the algorithm reduces
the constraints with real coefficients to constraints with integer coefficients
determining the same admissible set. The first step of this reduction is the
substitution of a given real vector with an appropriate rational vector, justified by
the following lemma.¹

Lemma 4. Given a vector $a \in \mathbb{R}^n$ with $|a_j| \le 1$, $j = 1, 2, \ldots, n$, and $D \in \mathbb{Z}_+$,
there exists an $O(n^4 \log n \, (n + \log D))$ algorithm that finds $p \in \mathbb{Z}^n$ and $q \in \mathbb{Z}_+$
such that $|a_j - p_j/q| < 1/(qD)$, $j = 1, 2, \ldots, n$, and $1 \le q \le 2^{n(n+5)/4} D^n$.

Proof. First we describe the algorithm finding $p \in \mathbb{Z}^n$ and $q \in \mathbb{Z}_+$ with the
required properties. It consists of two basic steps.

1. For each $a_j$, $1 \le j \le n$, find the closest rational fraction $\bar a_j$ with denominator
$G = \lceil 2^{n(n+5)/4} D^{n+1} \rceil$.

2. Apply the algorithm from Lemma 3 with input $\bar a = (\bar a_1, \ldots, \bar a_n) \in \mathbb{Q}^n$ and
$\varepsilon = 1/(2D)$. The output is a vector $p \in \mathbb{Z}^n$ and an integer $q \in \mathbb{Z}_+$ with
$\|\bar a - (1/q)p\| < 1/(2qD)$ and $1 \le q \le 2^{n(n+5)/4} D^n$.

Clearly, $|a_j - \bar a_j| \le 1/(2G)$. Then we have

$$\left| a_j - \frac{p_j}{q} \right| \le |a_j - \bar a_j| + \left| \bar a_j - \frac{p_j}{q} \right|
< \frac{1}{2G} + \frac{1}{2qD}
\le \frac{1}{2 \, [2^{n(n+5)/4} D^n] \, D} + \frac{1}{2qD} \le \frac{1}{qD},$$

i.e., the obtained vector $p$ and integer $q$ are as desired.
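For the reader's convenience, the exponent arithmetic behind the bound on $q$ and the choice of $G$ can be written out. It uses only Lemma 3 with $\varepsilon = 1/(2D)$ and the definition of $G$ from Step 1:

```latex
\[
q \;\le\; 2^{n(n+1)/4}\,\varepsilon^{-n}
  \;=\; 2^{n(n+1)/4}\,(2D)^{n}
  \;=\; 2^{n(n+1)/4 + n}\,D^{n}
  \;=\; 2^{n(n+5)/4}\,D^{n},
\qquad
G \;\ge\; 2^{n(n+5)/4} D^{n+1} \;\ge\; qD .
\]
```

The second relation is what turns the rounding error $1/(2G)$ into a term bounded by $1/(2qD)$ in the chain of inequalities.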

¹ To reduce the given real constraints to an equivalent set of integer constraints, one
can also use the approach from [?].

Now we evaluate the algorithm's complexity. For a given real number the
closest rational fraction with denominator $G = \lceil 2^{n(n+5)/4} D^{n+1} \rceil$ can be found
in time $O(\log G) = O(n^2 + n \log D)$. Thus the overall time complexity of Step 1
is $O(n^3 + n^2 \log D)$.²
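Step 1 is easy to illustrate with exact arithmetic. In the sketch below the real inputs of the BSS-model are replaced by Python `Fraction` values, which is an assumption made only so that the example is exact; the helper names are hypothetical, and the rounding uses a single built-in call rather than the logarithmic-time search that the complexity estimate refers to.

```python
from fractions import Fraction
from math import isqrt


def denominator_G(n, D):
    """G = ceil(2^(n(n+5)/4) * D^(n+1)), computed exactly with integers."""
    x = 2 ** (n * (n + 5)) * D ** (4 * (n + 1))   # fourth power of the real bound
    r = isqrt(isqrt(x))                            # floor of the fourth root
    return r if r ** 4 == x else r + 1


def closest_fraction(a, G):
    """Closest fraction to `a` with denominator G (`a` is a Fraction here)."""
    return Fraction(round(a * G), G)
```

For example, `denominator_G(2, 3)` evaluates to 306, and rounding any `a` with `closest_fraction(a, 306)` leaves an error of at most 1/612, which is the bound $1/(2G)$ used in the proof.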

Step 2 involves the simultaneous Diophantine approximation algorithm ap- 
plied to the particular class of inputs a G Q", e = 1/{2D) obtained in Step 1. As 
a matter of fact, this algorithm is a specialization of the Lovasz basis reduction 
algorithm, applied to a matrix of the form 

/'I ai ^ 

1 tt2 

1 CLji 

\ ^gJ 

where $a_1, a_2, \ldots, a_n$ are rational numbers, all of them with the same denominator
$G$. The following bound on the number of iterations holds.

Lemma 5. (see [?, Lemma 4.4]) In Step 2 of the algorithm, $O\left(\log \frac{F_{start}}{F_{end}}\right) =
O(n^3 + n^2 \log D)$ iterations are performed.

From (4) and Lemma 5 we obtain that the time complexity of Step 2 is

$$O\left(n^2 \log n \, \log \frac{F_{start}}{F_{end}}\right) = O(n^2 \log n \, (n^3 + n^2 \log D))
= O(n^4 \log n \, (n + \log D)).$$

Thus the overall time complexity of the algorithm is $O(n^4 \log n \, (n + \log D))$. □

The algorithm of Lemma 4 can be used to substitute any real constraint
$ax \le b$ with an integer one, preserving the same admissible integer points $x$ with
$0 \le x \le d$, $d \in \mathbb{R}^n$. More precisely, we have the following lemma.

Lemma 6. Let $T = \{x \in \mathbb{Z}^n : ax \le b,\ 0 \le x \le d\}$, where $a \in \mathbb{R}^n$, $b \in \mathbb{R}$, $d \in \mathbb{Z}^n$.
Then there exists an algorithm which finds a vector $r \in \mathbb{Z}^n$ and a number $r_0 \in \mathbb{Z}$
such that $T = \{x \in \mathbb{Z}^n \mid rx \le r_0,\ 0 \le x \le d\}$. The algorithm involves at most $n$
applications of the algorithm from Lemma 4 with $D = \|d\|$.

A proof of the above fact is available in [?, Lemma 5.1].

From Lemmas 0 and 0 we obtain that the overall time complexity of the 
reduction stage is 0(mn^ log n(n -I- log ||d||)). Furthermore, the bit-size of the 
generated integers is 0(n^(n -|- log ||d||)). Therefore, the overall bit-size of the 
reduced problem is 0(mn^(n -|- log ||d||)). 

² Note that in the BSS-model extended with a unit cost floor operation (the case
handled in [?]) this requires $O(n)$ operations.

At this point, the second of the announced results follows immediately. First
we unfold the applications of the Lovász basis reduction algorithm in an algebraic
decision tree with depth $O(m n^5 \log n \, (n + \log \|d\|))$. After that, we branch on
every bit of the obtained integer data problem, which adds 0(mn^(n + log ||d||)) 
to the depth of the tree. Thus we obtain the following theorem. 

Theorem 1. There is an $O(m n^5 \log n \, (n + \log \|d\|))$ algebraic decision tree for
ILPr.

To obtain our other result, we continue with the second stage of
the algorithm. That stage is an application of Lenstra's algorithm to the
integer data problem obtained as output of the first stage. A recursive step of
this algorithm reduces an n-dimensional problem to a set of subproblems of
dimension n - 1, whose number is exponential but depends only on n. The
basic algorithms used in this reduction are the Lovász basis reduction algorithm
and the Kannan-Bachem Hermite normal form algorithm. In addition, a linear
programming problem of dimension (m + 2n) x n is to be solved.

The two algorithms are applied to matrices of dimension depending only on n 
and with entries of bit-size 0(log ||d||), as the value of n is fixed. Then, by Lem- 
mas Hand 0 their complexity as well as the bit-size of the integers they generate, 
are bounded by 0(log ||d||). The linear programming problem can be solved in 
time 0(m + n) (i.e., linear in m) using the well-known Megiddo’s algorithm 1 1 (ij . 
Hence, if n is fixed, the overall complexity of this stage is 0(mlog ||d||). Thus 
we have obtained the following theorem. 

Theorem 2. There is an $O(m \log \|d\|)$ algorithm for $\mathrm{ILP}_{\mathbb{R}}$ of fixed dimension
n.

Theorem 2 implies a tight bound for the algebraic complexity of the knapsack
problem.

Corollary 1. The algebraic complexity of the knapsack problem KP^ of fixed 
dimension n is 0(log 

Proof An upper bound 0(log follows from Theorem 0 A lower bound 
I2(log follows from 0, where a tight bound 6>(log ) is proved for the 
algebraic complexity of the two-dimensional knapsack problem with real coeffi- 
cients. □ 

Regarding possible practical applications, one can expect that the proposed 
algorithm for ILPr may be useful for problems with a small number of variables 
and a large number of constraints. 



4 Lower Bound for the Knapsack Problem 

In this section we study the complexity of the Boolean knapsack problem 

(0/1-$\mathrm{KP}_{\mathbb{R}}$) Given $a \in \mathbb{R}^n$, decide if there is $x \in \{0,1\}^n$ such that $a^T x = 1$.

In the classical setting, the knapsack problem has been studied intensively (see 
m and the bibliography therein). In particular, the problem is NP-complete 
0. Regarding “real” knapsacks, a number of results have been proved. Notable 
among them are the lower bounds I7(n^ log and I7(n^) for KPr’s and 0/1- 
KPr’s complexity, respectively (see 0). In |3 the topological complexity of the 
latter problem is found. provides a parallel time lower bound. Some other 
results are presented in ITiaSI . For a related discussion the reader is also referred 
to 0. 

In this section, we take one more step towards determining the algebraic 
complexity of the knapsack problem. We obtain a result which complements the 
above mentioned Ben-Or’s lower bounds. More precisely, we have the following 
theorem. 

Theorem 3. No algorithm solving 0/1-$\mathrm{KP}_{\mathbb{R}}$ can achieve a time complexity
$o(n \log n) \cdot f(a_1, \dots, a_n)$, where $f$ is an arbitrary continuous function of n
variables.

Remark 1. The requirement for / to be a continuous function is essential, as 
follows from 0. More precisely, it has been shown that there is an 0{nj^) 

algorithm for 0/1— KPr, where 5{a) = min{|a^z| : yf 0,z € {—1,0, 1}"}. 

Proof. Assume the opposite, i.e., that there is an algorithm A that solves
0/1-$\mathrm{KP}_{\mathbb{R}}$ in $o(n \log n) \cdot f(a_1, \dots, a_n)$ time, for a continuous function $f$.
We consider a subclass C of inputs of 0/1-$\mathrm{KP}_{\mathbb{R}}$ determined by the constraints

$$\frac{1}{3} < a_i < \frac{2}{3} \quad \text{for } 1 \le i \le n.$$

Let us denote

$$C = \{a \mid \tfrac{1}{3} < a_i < \tfrac{2}{3},\ 1 \le i \le n\},$$

$$C_{yes} = \{a \mid a \in C,\ \exists x \in \{0,1\}^n : a^T x = 1\},$$

$$C_{no} = C \setminus C_{yes}.$$

For any problem input from the considered subclass the value $f(a_1, \dots, a_n)$
is bounded by a constant, and thus the time complexity of the algorithm A
reduces to $o(n \log n)$.

On the other hand, according to the Ben-Or’s theorem 0 , in the considered 
computational model a lower bound $\Omega(\log \#c.c.(C_{no}))$ holds for the complexity
of any algorithm solving 0/1-$\mathrm{KP}_{\mathbb{R}}$ for inputs from $C_{yes}$, where $\#c.c.(C_{no})$ is the
number of connected components of $C_{no}$. We will show that $\#c.c.(C_{no}) = n!$,
i.e., $\log \#c.c.(C_{no}) = \log n! = \Theta(n \log n)$. This will contradict the $o(n \log n)$ time
complexity of the algorithm A on inputs from C. Thus, to complete the proof it
suffices to prove the following lemma.

Lemma 7. The set Cno has exactly n! connected components. 

Proof. First of all, let us observe that $a \in C_{yes}$ if and only if there exist $i, j$, $i \ne j$, such
that $a_i + a_j = 1$. For every $a \in C_{no}$ we denote

$$S_a = \{\{a_i, a_j\} \mid i \ne j,\ a_i + a_j < 1\}.$$

As a first step of the proof we will show that $C_{no}$ has as many connected
components as the number of all distinct sets $S_a$, $a \in C_{no}$.

First we show that if for some $a', a'' \in C_{no}$ the condition $S_{a'} = S_{a''}$ holds,
then $a'$ and $a''$ belong to the same connected component of $C_{no}$. Clearly,
$C_{no} = \{a \mid a \in C,\ \forall x \in \{0,1\}^n : a^T x \ne 1\}$. Since $S_{a'} = S_{a''}$, for every $i, j$, $i \ne j$, both
$a'_i + a'_j - 1$ and $a''_i + a''_j - 1$ have the same sign, either positive or negative but
not zero. The same is the sign of $a_i + a_j - 1$, where $a = (a_1, \dots, a_n)$ is any
convex combination of $a'$ and $a''$. Thus $a \in C_{no}$, so that the whole segment with
endpoints $a'$ and $a''$ lies in $C_{no}$. This implies that these two points belong to the
same connected component.

Now we prove that if $a', a'' \in C_{no}$ are in the same connected component, then
$S_{a'} = S_{a''}$. Assume the opposite, i.e., that there are $a', a'' \in C_{no}$ belonging to the
same connected component D, while $S_{a'} \ne S_{a''}$. Then there are $i, j$, $i \ne j$, such
that $a'_i + a'_j < 1$ and $a''_i + a''_j > 1$. Let $\ell$ be a continuous curve with end-points
$a'$ and $a''$, which is contained in D. We define a function $h(x) = x_i + x_j$, where
$x_i, x_j$ are, respectively, the $i$th and $j$th component of $x \in \mathbb{R}^n$. Let us consider
the restriction of $h(x)$ on the curve $\ell$. We have $h(a') = a'_i + a'_j < 1 < a''_i + a''_j = h(a'')$.
Since $h$ is continuous, there must be a point $a$ on the curve $\ell$ for which
$h(a) = a_i + a_j = 1$. But $a \in \ell \subset D \subset C_{no}$ and therefore $a_i + a_j \ne 1$, which is a
contradiction.

Thus it only remains to count all distinct sets $S_a$, $a \in C_{no}$. We will use
induction on the dimension n. As the basis of the induction, for n = 2 we have
two such distinct sets, namely $\emptyset$ and $\{\{a_1, a_2\}\}$. Suppose that the thesis is true
for dimension n - 1, and take an arbitrary set $S_a$, where $a = (a_1, \dots, a_{n-1})$ is
an (n - 1)-dimensional vector. Without loss of generality we assume that the
coordinates of $a$ are all distinct, and consider them in an increasing order:

$$\frac{1}{3} < a_{i_1} < a_{i_2} < \dots < a_{i_{n-1}} < \frac{2}{3}.$$

To pass to dimension n, we need to add one more coordinate $a_n$ to the vector $a$.
One can choose $a_n$ in n different ways:

$$\tfrac{1}{3} < a_n < 1 - a_{i_{n-1}}, \qquad 1 - a_{i_1} < a_n < \tfrac{2}{3},$$

$$1 - a_{i_j} < a_n < 1 - a_{i_{j-1}} \quad \text{for } 2 \le j \le n - 1.$$

Thus we get n distinct sets “generated” by $S_a$:

$$S_a \cup \{\{a_{i_k}, a_n\} \mid 1 \le k \le j\}, \quad \text{for } 0 \le j \le n - 1.$$

Note also that two sets generated by sets $S_{a'} \ne S_{a''}$ are distinct because their
maximum subsets of (unordered) pairs not containing $a_n$ are distinct (i.e., the sets $S_{a'}$
and $S_{a''}$ themselves).

Thus we have proved that the number of connected components of $C_{no}$
increases by a factor of n when passing from dimension n - 1 to dimension n.
Hence, this number is exactly n!, as claimed. □
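
The estimate $\log n! = \Theta(n \log n)$ used in the proof of Theorem 3 is standard. One elementary way to see it, spelled out here for convenience and not taken from the paper, is to bound $n!$ from both sides for $n \ge 2$: the upper bound replaces every factor by $n$, and the lower bound keeps only the largest $\lceil n/2 \rceil$ factors, each of which is at least $n/2$.

```latex
\[
\Bigl(\frac{n}{2}\Bigr)^{n/2} \;\le\; n! \;\le\; n^{n}
\quad\Longrightarrow\quad
\frac{n}{2}\,\log\frac{n}{2} \;\le\; \log n! \;\le\; n \log n .
\]
```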
Remark 2. Ben-Or's theorem states a lower bound $\Omega(\log \#c.c.(C_{no}))$ on the
complexity of any algorithm solving 0/1-$\mathrm{KP}_{\mathbb{R}}$. Note that, in the general case,
when no restrictions on the coefficients $a_1, a_2, \dots, a_n$ are imposed, the number
of connected components of $C_{no}$ is $2^{\Omega(n^2)}$. Hence, the lower bound $\Omega(n^2)$ holds.
This larger number $2^{\Omega(n^2)}$ (compared to the number n! of connected components
of the class C) is due to the inputs in which the $a_i$'s are “close to 0 or 1.” These
inputs are excluded from C. Therefore, Ben-Or's bound $\Omega(n^2)$ does not
directly imply ours, although we have used his theorem, together with Lemma 7,
to obtain ours.

Remark 3. To obtain the complexity result of Theorem 3 we have used the
class of instances $C = \{a \mid \frac{1}{3} < a_i < \frac{2}{3},\ 1 \le i \le n\}$. It is easy to see that an
$O(n \log n)$ algorithm exists for this particular class. To show this, first we sort
in $O(n \log n)$ time the coefficients of $a^T x = 1$. After an appropriate substitution
and enumeration of the variables we obtain an equation with coefficients
$a_1 \le a_2 \le \dots \le a_n$. As already observed, a solution to $a^T x = 1$ exists if and only if
there exist $i, j$, $i \ne j$, such that $a_i + a_j = 1$. In order to check whether this condition is met,
we search the sorted array of coefficients in linear time, as follows. Set $i = 1$,
$j = n$. If $a_i + a_j < 1$, then set $i := i + 1$; if $a_i + a_j > 1$, then set $j := j - 1$. If
$a_i + a_j = 1$ or $i = j$, then stop. In the former case the equation has a solution
(namely, $x_i = 1$, $x_j = 1$, $x_k = 0$ for $k \ne i, j$). In the latter case a solution
does not exist. The complexity of this procedure is $O(n)$, and thus the overall
complexity of the algorithm is $O(n \log n)$. The proof of Theorem 3 demonstrates
that for the class C this algorithm is, in fact, optimal.
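
The sort-and-scan procedure of Remark 3 can be sketched in a few lines. This is only an illustration, not the authors' code: the function name `pair_summing_to_one` is made up here, exact rational arithmetic (`fractions.Fraction`) stands in for the real-number model of the paper, and the input is assumed to lie in the class C, i.e. every coefficient is strictly between 1/3 and 2/3.

```python
from fractions import Fraction


def pair_summing_to_one(coeffs):
    """Return original indices (i, j), i != j, with coeffs[i] + coeffs[j] == 1, or None.

    Assumes (as in Remark 3) that every coefficient lies strictly between
    1/3 and 2/3, so a 0/1 solution of a^T x = 1 uses exactly two coefficients.
    """
    # O(n log n): sort the indices by coefficient value.
    order = sorted(range(len(coeffs)), key=lambda idx: coeffs[idx])
    lo, hi = 0, len(order) - 1
    # O(n): move the two pointers towards each other.
    while lo < hi:
        total = coeffs[order[lo]] + coeffs[order[hi]]
        if total == 1:
            return order[lo], order[hi]
        if total < 1:
            lo += 1   # sum too small: drop the smallest remaining coefficient
        else:
            hi -= 1   # sum too large: drop the largest remaining coefficient
    return None


yes_instance = [Fraction(2, 5), Fraction(7, 12), Fraction(3, 5), Fraction(1, 2)]
no_instance = [Fraction(2, 5), Fraction(1, 2), Fraction(11, 20)]
print(pair_summing_to_one(yes_instance))  # (0, 2), since 2/5 + 3/5 = 1
print(pair_summing_to_one(no_instance))   # None
```

With floating-point inputs the equality test would need a tolerance, which is why the sketch uses exact fractions.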

Clearly, an analogous result holds for the complexity of the classical Boolean
knapsack problem (0/1-KP) $a^T x = b$ with integer coefficients. This problem is
equivalent to the equation $\alpha^T x = 1$ with $\alpha = \frac{1}{b} a$, to which Theorem 3 applies.
Thus, we have lower time complexity bounds, both for the Boolean knapsack
problem with real coefficients and the classical formulation, and these bounds
are independent of the known lower bound $\Omega(n^2)$.

One can show that the result of Theorem 3 is still valid for the integer
knapsack problem $\mathrm{KP}_{\mathbb{R}}$. Unlike the Boolean case, here an input $a$ belongs to
$C_{yes}$ not only if $a_i + a_j = 1$ for some indices $i, j$, but also if $a_k = \frac{1}{2}$ for some
index $k$. Accordingly, the set $S_a$ is modified as

Sa — Uy } I ai “t“ Qj ^ 1} . 

Note in the definition above that we allow i = j. This adds to the set Sa every 
singleton {ak} such that ak = 1/2. One can show that Cno has at least as many 
connected components as the number of the distinct sets $S_a$, $a \in C_{no}$. Then by
induction on n one obtains that the set $C_{no}$ has at least n! connected components,
from which the result follows. The proof is a straightforward modification of the
one of Theorem 3 and therefore is omitted.

As a last comment of this section we mention that Ben-Or's lower bound,
as well as the negative complexity result of Theorem 3, suggest seeking certain
approximate solutions to the knapsack problem. For instance, one can look
for a Boolean vector that is “close enough” to the hyperplane determined by
$a^T x = 1$ (e.g., minimizing $|a^T x - 1|$ within a given tolerance $\varepsilon > 0$). We notice
that the best of the existing algorithms for this approximation
problem are indeed linear with respect to the dimension n.

5 Concluding Remarks 

We have presented an $O(m \log \|d\|)$ algorithm for integer linear programming
with real coefficients and fixed number of variables, within the Blum-Shub-Smale
computational model. A further task would be to show that this complexity
bound is tight.

We have also obtained a lower bound for the complexity of the real knapsack 
problem. In view of this result, it would be interesting to know if there is an 
algorithm for 0/1 -KPr with time complexity 0{n^~^ f{a)), <5 > 0. 

Some of the obtained results (e.g., Corollary 1) show that the integer
programming problems are, in general, intractable in the framework of the
complexity theory over the reals, since their complexity cannot be bounded by any
polynomial in the input size, the latter being a polynomial only in m and n. Some
further refinements of the theory suggest, however, that these problems can be 
considered as efficiently solvable. Following Smale Ea, a numerical algorithm 
can be considered as efficient only if its complexity is bounded by a polynomial 
in the problem dimensions and the logarithm of its weight. The weight function 
is defined in accordance with the problem specificity and used to measure the 
difficulty of a problem instance. Under such a convention, let us define the weight 
of ILPr as a number ||d|| which bounds the norms of the admissible solutions. 
Then the results of Section 3 imply that ILPr and KPr are efficiently solvable 
in the above sense. 
Abstract. We perform a convergence analysis of simulated annealing
for the special case of logarithmic cooling schedules. For this class of
simulated annealing algorithms, B. Hajek [7] proved that the convergence
to optimum solutions requires the lower bound $\Gamma/\ln(k+2)$ on
the cooling schedule, where $k$ is the number of transitions and $\Gamma$ denotes
the maximum value of the escape depth from local minima. Let n be
a uniform upper bound for the number of neighbours in the underlying
configuration space. Under some natural assumptions, we prove the
following convergence rate: After $k > n^{O(\Gamma)} + \log^{O(1)}(1/\varepsilon)$ transitions the
probability to be in an optimum solution is larger than $(1 - \varepsilon)$. The result
can be applied, for example, to the average case analysis of stochastic
algorithms when estimations of the corresponding values $\Gamma$ are known.



1 Introduction 

Simulated annealing was introduced independently by Kirkpatrick et al. 0 
and V. Cerny as a new class of algorithms computing approximate solutions 
of combinatorial optimisation problems. The general approach itself is derived 
from Metropolis’ method m to calculate equilibrium states for substances 
consisting of interacting molecules. 

Simulated annealing algorithms can be distinguished by the method that 
determines how the temperature is lowered at different times of the compu- 
tation process (the so-called cooling schedule). When the temperature is kept 
constant for a (large) number of steps, one can associate a homogeneous Mar- 
kov chain with this type of computation. Under some natural assumptions, the 
probabilities of configurations tend to the Boltzmann distribution in the case of 
homogeneous Markov chains (see 1 1, 'll 1 4lf bj V This class of simulated annealing 
algorithms has been studied intensely and numerous heuristics have been derived 
for a wide variety of combinatorial optimisation problems (see EEcmaini). 

When the temperature is lowered at any step, the probabilities of configu- 
rations are computed from an inhomogeneous Markov chain. The special case 
of logarithmic cooling schedules has been investigated in [11415 17| . B. Hajek |7] 
proved that logarithmic simulated annealing tends to an optimum solution if and
only if the cooling schedule is lower bounded by $\Gamma/\ln(k+2)$, where $\Gamma$ is the
maximum value of the escape depth from local minima of the underlying energy
landscape. O. Catoni obtained a convergence rate of $M \cdot k^{-\alpha}$, where $M$
depends on the number of configurations with a certain value of the objective
function and $\alpha$ is related to the parameter $\Gamma$. Since structural parameters are
treated as constants, the estimation does not provide a time bound that depends
on $\Gamma$ and the confidence parameter $\varepsilon$ only.

Given the configuration space $\mathcal{F}$, let $\mathbf{a}_f(k)$ denote the probability to be in
configuration $f$ after $k$ steps of an inhomogeneous Markov chain. The problem is
to find a lower bound for $k$ such that $\sum_{f \in \mathcal{F}_{\min}} \mathbf{a}_f(k) > 1 - \varepsilon$ for the set
$\mathcal{F}_{\min} \subseteq \mathcal{F}$ of configurations minimising the objective function. By n we denote
a uniform upper bound for the number of neighbours of configurations $f \in \mathcal{F}$.

We obtain a run-time of $n^{O(\Gamma)} + \log^{O(1)}(1/\varepsilon)$ that is sufficient to approach
with probability $1 - \varepsilon$ the minimum value of the objective function. The present
approach is based on a very detailed analysis of transition probabilities and has 
been developed in the context of equilibrium computations for specific physical 
systems 0. 



2 Preliminaries 

Simulated annealing algorithms act within a configuration space in
accordance with a certain neighbourhood relation, where the particular
transitions between adjacent elements of the configuration space are governed by an
objective function. In this section we introduce these notations for our
problem setting.

The configuration space is finite and denoted by $\mathcal{F}$. We assume an objective
function $Z : \mathcal{F} \to \mathbb{N}$ that for simplicity takes its values from the set of integers.
By $\mathcal{N}_f$ we denote the set of neighbours of $f$, including $f$ itself.

We assume that $\mathcal{F}$ is reversible: Any transition $f \to f'$, $f' \in \mathcal{N}_f$, can be
performed in the reverse direction, i.e., $f \in \mathcal{N}_{f'}$. We set $n := \max_{f \in \mathcal{F}} |\mathcal{N}_f|$.

The set of minimal elements (optimum solutions) is defined by

$$\mathcal{F}_{\min} := \{f : \forall f' (f' \in \mathcal{F} \to Z(f) \le Z(f'))\}.$$

Example We consider Boolean conjunctive terms defined on n variables. The con- 
junctions are of length [logn] and have to be guessed from negative examples and 
one positive example a. Hence, F consists of G = and all subterms t that can 
be obtained by deleting a literal from x'^ and that reject all negative examples. Each 
neighbourhood $\mathcal{N}_t$ contains at most $n' \le n + 1$ elements, since deleted literals can be
included again. The configuration space is therefore reversible. The objective function
is given by the length of terms t, and in this case the optimum is known and equal to
[log n].

In simulated annealing, the transitions between neighbouring elements
depend on the objective function Z. Given a pair of configurations $[f, f']$,
$f' \in \mathcal{N}_f$, we denote by $G[f, f']$ the probability of generating $f'$ from $f$ and
by $A[f, f']$ the probability of accepting $f'$ once it has been generated from $f$.
Since we consider a single step of transitions, the value of $G[f, f']$ depends on
the set $\mathcal{N}_f$. As in most applications of simulated annealing, we take a uniform
probability which is given by

(1)  $G[f, f'] := \dfrac{1}{|\mathcal{N}_f|}.$

The acceptance probabilities A[f, /'],/'€ C A" are derived from the underlying 
analogy to thermodynamic systems EEH]: 



(2)  $A[f, f'] := \begin{cases} 1, & \text{if } Z(f') - Z(f) \le 0, \\ e^{-\frac{Z(f') - Z(f)}{c}}, & \text{otherwise,} \end{cases}$

where $c$ is a control parameter having the interpretation of a temperature in
annealing procedures.
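
The two probabilities can be written down directly. The sketch below is illustrative only: the function names and the dictionary representation of neighbourhoods are choices made here, not part of the paper, and the value 0 returned for pairs that are not neighbours is an assumption (the text only specifies the uniform value for generated neighbours).

```python
import math


def generation_probability(neighbours, f, f_new):
    """Uniform generation probability G[f, f'] = 1/|N_f| for f' in N_f.

    `neighbours[f]` lists N_f and contains f itself, as in the text.
    Returning 0.0 for non-neighbours is an assumption of this sketch.
    """
    return 1.0 / len(neighbours[f]) if f_new in neighbours[f] else 0.0


def acceptance_probability(Z, f, f_new, c):
    """Metropolis-type acceptance A[f, f'] at temperature c > 0."""
    delta = Z[f_new] - Z[f]
    return 1.0 if delta <= 0 else math.exp(-delta / c)


# Hypothetical four-state landscape on a path 0 - 1 - 2 - 3.
neighbours = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2, 3], 3: [2, 3]}
Z = {0: 0, 1: 2, 2: 1, 3: 3}

print(generation_probability(neighbours, 1, 2))   # one of three neighbours
print(acceptance_probability(Z, 1, 0, c=2.0))     # downhill move, always accepted
print(acceptance_probability(Z, 0, 1, c=2.0))     # uphill by 2 at c = 2, i.e. exp(-1)
```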

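Finally, the probability of performing the transition between $f$ and $f'$ is
defined by

(3)  $\Pr\{f \to f'\} = \begin{cases} G[f, f'] \cdot A[f, f'], & \text{if } f' \ne f, \\ 1 - \sum_{g \ne f} G[f, g] \cdot A[f, g], & \text{otherwise.} \end{cases}$

By definition, the probability $\Pr\{f \to f'\}$ depends on the control parameter $c$.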

Let $\mathbf{a}_f(k)$ denote the probability of being in the configuration $f$ after $k$ steps
performed for the same value of $c$. The probability $\mathbf{a}_f(k)$ can be calculated in
accordance with

(4)  $\mathbf{a}_f(k) := \sum_{h} \mathbf{a}_h(k-1) \cdot \Pr\{h \to f\}.$

The recursive application of (4) defines a Markov chain of probabilities $\mathbf{a}_f(k)$,
where $f \in \mathcal{F}$ and $k = 1, 2, \dots$. If the parameter $c = c(k)$ is a constant $c$, the
chain is said to be a homogeneous Markov chain; otherwise, if $c(k)$ is lowered at
any step, the sequence of probability vectors $\mathbf{a}(k)$ is an inhomogeneous Markov
chain.

In the present paper we focus on a special type of inhomogeneous
Markov chains where the value $c(k)$ changes in accordance with

(5)  $c(k) = \dfrac{\Gamma}{\ln(k+2)}, \quad k = 0, 1, \dots$
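
To make the inhomogeneous chain concrete, the following sketch propagates a probability vector through the recursion $\mathbf{a}_f(k) = \sum_h \mathbf{a}_h(k-1) \Pr\{h \to f\}$ while the temperature follows the logarithmic schedule $c(k) = \Gamma/\ln(k+2)$. It is a toy illustration under stated assumptions: the function names, the four-state landscape and the choice $\Gamma = 1$ are hypothetical, and the convention of using $c(k)$ for the transition out of step $k$ (starting at $k = 0$) is fixed here for simplicity.

```python
import math


def transition_matrix(neighbours, Z, c):
    """Transition probabilities Pr{f -> g} at temperature c (uniform generation,
    Metropolis acceptance, remaining mass stays at f)."""
    states = sorted(neighbours)
    P = {f: {g: 0.0 for g in states} for f in states}
    for f in states:
        for g in neighbours[f]:
            if g != f:
                delta = Z[g] - Z[f]
                accept = 1.0 if delta <= 0 else math.exp(-delta / c)
                P[f][g] = accept / len(neighbours[f])
        P[f][f] = 1.0 - sum(P[f][g] for g in states if g != f)
    return P


def anneal_distribution(neighbours, Z, gamma, a0, steps):
    """Propagate the distribution a0 (a dict over all states) for `steps` transitions."""
    a = dict(a0)
    for k in range(steps):
        c = gamma / math.log(k + 2)          # logarithmic cooling schedule
        P = transition_matrix(neighbours, Z, c)
        a = {f: sum(a[h] * P[h][f] for h in a) for f in a}
    return a


# Hypothetical landscape: state 0 is the optimum, state 2 a local minimum
# that can only be left by climbing one unit (escape depth 1).
neighbours = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2, 3], 3: [2, 3]}
Z = {0: 0, 1: 2, 2: 1, 3: 3}
a0 = {f: 0.25 for f in neighbours}

result = anneal_distribution(neighbours, Z, gamma=1.0, a0=a0, steps=2000)
print(result)  # most of the probability mass ends up in state 0
```

The slow, logarithmic decrease of the temperature is what lets mass trapped in the local minimum keep leaking out; a faster schedule would freeze it there.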



The choice of $c(k)$ is motivated by Hajek's Theorem [7] on logarithmic cooling
schedules for inhomogeneous Markov chains. To explain Hajek's result, we first
need to introduce some parameters characterising local minima of the objective
function:

Definition 1 A configuration $f' \in \mathcal{F}$ is said to be reachable at height $h$ from
$f \in \mathcal{F}$, if $\exists f_0, f_1, \dots, f_r \in \mathcal{F}\ (f_0 = f \wedge f_r = f')$ such that $G[f_u, f_{u+1}] > 0$,
$u = 0, 1, \dots, (r - 1)$, and $Z(f_u) \le h$ for all $u = 0, 1, \dots, r$.

We use the notation $height(f \Rightarrow f') \le h$ for this property. The configuration $f$ is a
local minimum if $f \in \mathcal{F} \setminus \mathcal{F}_{\min}$ and $Z(f') > Z(f)$ for all $f' \in \mathcal{N}_f \setminus f$.



Definition 2 Let $g_{\min}$ denote a local minimum; then $depth(g_{\min})$ denotes the
smallest $h$ such that there exists a $g' \in \mathcal{F}$, where $Z(g') < Z(g_{\min})$, which is
reachable at height $Z(g_{\min}) + h$.
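
For a small, explicitly given landscape the escape depth of Definition 2 can be computed by a minimax ("bottleneck") shortest-path search. The sketch below is an illustration added here and is not the authors' procedure; the function names and the toy landscape are hypothetical, and states are assumed to be comparable values (integers) so that they can be stored in a heap.

```python
import heapq


def lowest_passes(neighbours, Z, start):
    """For every reachable g, the smallest h such that g is reachable at height h
    from `start`, i.e. the minimum over all paths of the maximum Z along the path."""
    best = {start: Z[start]}
    heap = [(Z[start], start)]
    while heap:
        h, f = heapq.heappop(heap)
        if h > best[f]:
            continue  # stale heap entry
        for g in neighbours[f]:
            candidate = max(h, Z[g])
            if candidate < best.get(g, float("inf")):
                best[g] = candidate
                heapq.heappush(heap, (candidate, g))
    return best


def depth(neighbours, Z, g_min):
    """Escape depth of a local (non-global) minimum g_min as in Definition 2."""
    passes = lowest_passes(neighbours, Z, g_min)
    return min(h for g, h in passes.items() if Z[g] < Z[g_min]) - Z[g_min]


# Hypothetical landscape on a path 0 - 1 - 2 - 3; state 2 is a local minimum.
neighbours = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2, 3], 3: [2, 3]}
Z = {0: 0, 1: 2, 2: 1, 3: 3}
print(depth(neighbours, Z, 2))  # 1: state 0 is reached by climbing from Z = 1 to Z = 2
```

The parameter $\Gamma$ of the cooling schedule then has to be at least the maximum of `depth` over all local minima; for a global minimum the function raises `ValueError`, since no lower state exists.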



The following convergence property has been proved by B. Hajek: 



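Theorem 1 [7] Given a cooling schedule defined by

$$c(k) = \frac{\Gamma}{\ln(k+2)}, \quad k = 0, 1, \dots,$$

the asymptotic convergence $\sum_{f \in \mathcal{F}_{\min}} \mathbf{a}_f(k) \xrightarrow[k \to \infty]{} 1$ of the stochastic algorithm that is
based on (2) and (3) is guaranteed if and only if

(i) $\forall g, g' \in \mathcal{F}\ \exists g_0, g_1, \dots, g_r \in \mathcal{F}\ (g_0 = g \wedge g_r = g')$:
$G[g_u, g_{u+1}] > 0$, $u = 0, 1, \dots, (r - 1)$;

(ii) $\forall h$ : $height(f \Rightarrow f') \le h \iff height(f' \Rightarrow f) \le h$;

(iii) $\Gamma \ge \max_{g_{\min}} depth(g_{\min})$.
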

In the following, we assume that the conditions (i), ... , (iii) of Hajek's Theorem
are satisfied for our configuration space $\mathcal{F}$. Furthermore, we set the upper bound
$|\mathcal{F}| \le \exp(n^{O(1)})$, which is common in combinatorial optimisation. Additionally,
we assume that the difference of the objective function is the same between
neighbouring elements, as in the case of our Example.



3 Convergence Analysis 

Our convergence result is derived from a careful analysis of the “exchange of
probabilities” among configurations that belong to adjacent distance levels, where
the distance is measured by the minimum number of steps to reach an element of
$\mathcal{F}_{\min}$. Therefore, in addition to the value of the objective function, the elements
of the configuration space are further distinguished by their distance to $\mathcal{F}_{\min}$.
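We introduce the following partition of the set of configurations with respect
to the value of the objective function:

$$L_0 := \mathcal{F}_{\min} \quad \text{and} \quad L_{h+1} := \Bigl\{ f : f \in \mathcal{F} \setminus \bigcup_{i=0}^{h} L_i \ \wedge\ \forall f' \Bigl( f' \in \mathcal{F} \setminus \bigcup_{i=0}^{h} L_i \to Z(f') \ge Z(f) \Bigr) \Bigr\}.$$

The highest level within $\mathcal{F}$ is denoted by $L_{h_{\max}}$, and we set
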



$s(f) := |\{f' : f' \in \mathcal{N}_f \wedge Z(f') > Z(f)\}|,$

$q(f) := |\{f' : f' \in \mathcal{N}_f \wedge Z(f') = Z(f)\}|,$

$r(f) := |\{f' : f' \in \mathcal{N}_f \wedge Z(f') < Z(f)\}|.$



As will be seen later, the $q(f)$ neighbors with the same value of the objective
function can be treated in the same way as $f$ itself (we have $q(f) = 0$ in our
Example of Boolean conjunctions). Therefore, we will not distinguish between $f$
and these neighbors, see Lemma 2 and the following equations. The introduced
notations imply

(6)  $s(f) + q(f) + r(f) = |\mathcal{N}_f| - 1.$
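
The three counters are easy to compute for an explicit neighbourhood structure. In the sketch below (function name and toy landscape are hypothetical) the configuration $f$ itself is excluded from all three counts; this reading is an assumption, chosen because it is what makes the identity $s(f) + q(f) + r(f) = |\mathcal{N}_f| - 1$ hold when $\mathcal{N}_f$ contains $f$.

```python
def neighbour_counts(neighbours, Z, f):
    """Return (s, q, r): the numbers of neighbours of f, other than f itself,
    whose objective value is larger than, equal to, or smaller than Z[f]."""
    others = [g for g in neighbours[f] if g != f]
    s = sum(1 for g in others if Z[g] > Z[f])
    q = sum(1 for g in others if Z[g] == Z[f])
    r = sum(1 for g in others if Z[g] < Z[f])
    return s, q, r


# Hypothetical landscape on a path 0 - 1 - 2 - 3 (each N_f contains f itself).
neighbours = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2, 3], 3: [2, 3]}
Z = {0: 0, 1: 2, 2: 1, 3: 3}

for f in neighbours:
    s, q, r = neighbour_counts(neighbours, Z, f)
    # Identity (6): the three counts cover every neighbour except f itself.
    assert s + q + r == len(neighbours[f]) - 1
    print(f, (s, q, r))
```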



We consider the probability $\mathbf{a}_f(k)$ to be in the configuration $f \in \mathcal{F}$ after $k$
steps of an inhomogeneous Markov chain, i.e., the “temperature” is defined in
accordance with (5). We observe that

(7)  $\dfrac{1}{(k+2)^{(Z(f) - Z(f'))/\Gamma}} = e^{-\frac{Z(f) - Z(f')}{c(k)}}, \quad k \ge 0.$
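
The observation is a one-line consequence of the logarithmic schedule $c(k) = \Gamma/\ln(k+2)$. The short derivation below is added for the reader; the shorthand $\Delta Z$ for the difference of objective values is not used in the paper.

```latex
\[
e^{-\Delta Z / c(k)}
\;=\; e^{-\Delta Z \,\ln(k+2)/\Gamma}
\;=\; (k+2)^{-\Delta Z/\Gamma}
\;=\; \frac{1}{(k+2)^{\Delta Z/\Gamma}} .
\]
```

So an uphill move of size $\Delta Z$ is accepted with a probability that decays only polynomially in the step number, with exponent $\Delta Z/\Gamma$.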



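To simplify notations we take a new objective function with the same notation:
$Z(f) := Z(f)/\Gamma$.

By using (1) to (3), one obtains from (4) by straightforward calculations

$$\begin{aligned}
\mathbf{a}_f(k) &= \sum_{f'} \mathbf{a}_{f'}(k-1) \cdot \Pr\nolimits_{c(k)}\{f' \to f\} \\
&= \mathbf{a}_f(k-1) \cdot \Pr\nolimits_{c(k)}\{f \to f\} + \sum_{f' \ne f} \mathbf{a}_{f'}(k-1) \cdot \Pr\nolimits_{c(k)}\{f' \to f\} \\
&= \mathbf{a}_f(k-1) \cdot \Bigl(1 - \sum_{f' \ne f} \Pr\nolimits_{c(k)}\{f \to f'\}\Bigr) + \sum_{f' \ne f} \mathbf{a}_{f'}(k-1) \cdot \Pr\nolimits_{c(k)}\{f' \to f\}.
\end{aligned}$$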



We employ 0, and the resulting equation is given by 

llAfI I AT/ I 

zUi) > z(f) 



^ -a/(fc- 1) + 
s(/)+l(/) 

+ E 

i = 1 

Z{fi) > Z(f) 
r{f) 

+ E 

j = 1 

Z(f') < ZU) 



1 ) 

lAT/J 



I 



1 



This representation (expansion) will be used in the following as the main relation
reducing $\mathbf{a}_f(k)$ to probabilities from previous steps.

Given $f \in \mathcal{F}$, we consider a shortest path of length $\mathrm{dist}(f)$ from $f$ to
$\mathcal{F}_{\min}$, i.e., to some $f' \in \mathcal{F}_{\min}$. We introduce a partition of $\mathcal{F}$ with respect to $\mathrm{dist}(f)$:

$$f \in M_i \iff \mathrm{dist}(f) = i \ge 1, \quad \text{and} \quad \mathcal{F} = \bigcup_{i=0}^{s} M_i,$$

where $M_0 := L_0 = \mathcal{F}_{\min}$. Thus, we distinguish between distance levels $M_i$ related
to the minimal number of transitions required to reach an element of $\mathcal{F}_{\min}$ and
the levels $L_h$ that are defined by the objective function.
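
The distance levels can be obtained by a breadth-first search that starts from all optimum configurations at once. The sketch below is illustrative only (the function name and the toy landscape are hypothetical); it relies on the reversibility assumption, so that the number of steps from $f$ to the set of optima equals the number of steps from that set to $f$.

```python
from collections import deque


def distance_levels(neighbours, Z):
    """Map each distance i to the set M_i of configurations whose shortest path
    to an optimum configuration has exactly i transitions."""
    z_min = min(Z.values())
    dist = {f: 0 for f in neighbours if Z[f] == z_min}   # M_0: the optima
    queue = deque(dist)
    while queue:
        f = queue.popleft()
        for g in neighbours[f]:
            if g not in dist:
                dist[g] = dist[f] + 1
                queue.append(g)
    levels = {}
    for f, d in dist.items():
        levels.setdefault(d, set()).add(f)
    return levels


# Hypothetical landscape on a path 0 - 1 - 2 - 3 with optimum 0.
neighbours = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2, 3], 3: [2, 3]}
Z = {0: 0, 1: 2, 2: 1, 3: 3}
print(distance_levels(neighbours, Z))
```

Note that the distance levels ignore the objective function beyond identifying the optima: in the toy landscape state 2 has a smaller value than state 1 but lies one level further out.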

We suppose that $k > s$, and we are going backwards from the $k$th step. Our
aim is to reduce the value

$$\sum_{f \notin M_0} \mathbf{a}_f(k)$$

to the sum of probabilities from previous steps. First, we consider the expansion
of only a particular probability $\mathbf{a}_f(k)$ as shown in (8). At the same step $k$, the
neighbors of $f$ are generating terms containing $\mathbf{a}_f(k-1)$ as a factor, in the
same way as $\mathbf{a}_f(k)$ generates terms with factors $\mathbf{a}_{f_i}(k-1)$ and $\mathbf{a}_{f_j}(k-1)$. Since
we consider the entire sum $\sum_{f \notin M_0} \mathbf{a}_f(k)$, the terms corresponding to a particular
$\mathbf{a}_f(k-1)$ can be collected together in the expansion (8) to form a single term.
Therefore, we obtain



(9) ay(fc-l). 



A^/ I -r{f) 



s(/)+9(/) 



Afi 



f 



- E 



1 



Aff 



i = 1 ' ^ ' (^ + l) 

Z(fi) > Z(f) 



z(U)-z(f) 



Hf) 

^ (* + l) 

Z{fi) > Z(f) 

= a/(fc- !)■ 



1 



Z(fi)-Z(f) 



r(f) 

E 

j = 1 

Z{fi) < Z{f) 



\Aff 



However, it is important to distinguish between elements from $M_1$ and elements
from $M_i$, $i \ge 2$: If $f \in M_1$, the neighbors from $M_0$ do not generate terms
containing probabilities from higher levels, because the probabilities of these
neighbors are not expanded during the particular steps of the recursive procedure
since they are not present in the sum $\sum_{f \notin M_0} \mathbf{a}_f(k)$. If $f \in M_1$, the terms
related to $Z(f_j) < Z(f)$, where $f_j \in M_0$, are missing, and therefore the following
arithmetic term is generated:

(10)  $\Bigl(1 - \dfrac{r'(f)}{|\mathcal{N}_f|}\Bigr) \cdot \mathbf{a}_f(k-1),$

where $r'(f) \le r(f)$ is the number of neighbors from $M_0$. We introduce the
following abbreviations:



‘P{f,f,v) := 



\^f\ k i-^/i ■ 



The relations expressed in (0 till m can be summarized to 
Lemma 1 A single step of the expansion o/ X)/^M o ^/(^) t'esttfe in 

r'if) 






f ^ Mo 



Mo 



f e Ml 






■a/(^-l) + 



r'(/) 



+ E E 

f £ Ml j = I 

fj G Mo n TV"/ 



.a, ik-V) 

lATrJ 



Sketch of Proof: We obtain from (|2|) and (II 1 )ll : 

E = E + E 

f ^ Mo f ^ Mo U Ml f e Ml 



E 


afik-l) 


f Mo U Ml 


r'(f) 


+ E 


E 


/ G Ml 


i = 1 


fj 


G Mo n TV"/ 



- E 

f e Ml 



r'if) \ 

\Aff\) 



|Af/, I 



•a/.(fc- 1). 



■a/(^-l) + 



The sum containing ip{f,fj,l)/ \ Afj. \ is generated from aj(fc), / G Mi, in 
accordance with The negative product a/(fc — 1) • r(/)/ | A// | is taken 
separately, and one obtains the stated equation. q.e.d. 

The diminishing factor $(1 - r(f)/|\mathcal{N}_f|)$ appears by definition for all elements
of $M_1$. At subsequent reduction steps, the factor is “transmitted” successively
to all probabilities from higher distance levels $M_i$ because any element of $M_i$
has at least one neighbor from $M_{i-1}$. The main task is now to analyze how this
diminishing factor changes if it is transmitted to the next higher distance level.
We denote



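(11)  $\sum_{f \notin M_0} \mathbf{a}_f(k) = \sum_{f \notin M_0} \mu(f, v) \cdot \mathbf{a}_f(k-v) + \sum_{f' \in M_0} \mu(f', v) \cdot \mathbf{a}_{f'}(k-v),$

i.e., the coefficients $\mu(f, v)$ are the factors at probabilities after $v$ steps of an
expansion of $\sum_{f \notin M_0} \mathbf{a}_f(k)$. We have $\mu(f, 1) = 1 - r'(f)/|\mathcal{N}_f|$ for $f \in M_1$,
and $\mu(f, 1) = 1$ for the remaining $f \in \mathcal{F} \setminus (M_0 \cup M_1)$. For $f' \in M_0$, Lemma 1
implies:

(12)  $\mu(f', 1) = \sum_{\substack{i = 1 \\ f_i \in M_1 \cap \mathcal{N}_{f'}}}^{s(f')} \dfrac{\varphi(f_i, f', 1)}{|\mathcal{N}_{f'}|}.$
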

Starting from step (fc — 1), the generated probabilities a.fi{k — u) are expanded 
in the same way as all other probabilities. 

Except for the initial values, the coefficients ^ Mq, and 

f G Mq, can be treated in the same way because the expansion (0 is valid for 
both B.f(k) and a//(fc). Therefore, we can apply the same considerations which 
led to equation (0 to /r(/, v) ■ af{k — v) and /r(/', v) ■ aj/{k — v) from (Tm . We 
perform an inductive step from (k — v + 1) to {k — v) and we suppose that all 
terms within | ... } of Q are multiplied by the corresponding — 1): 



f{k-v) ■ \ 



A/'/I -K/) _ V + 

I A/> I I A/> 



-{zUi)-z(f)). 



»(/)+?(/) 

' E 

i ^ 1 
fi>f 



1 ) 

lATr I 



1 



(fc+1) 



zUi)-zU) 



fi> f 
r{f) 

+ E 

j = 1 
fj < f 



IMI 



= af{k - v) ■ ^i{f,v). 

We obtain for r; > 2: 

/'</ r>/ 



It is important to note that the summands are divided by the same value \Afj |, 
see (0. We set := l — v{f,j) because we are interested in the convergence 

of both values and v{f,j) to zero. Substituting ^(/, j) by 1 — v{f, j) and 

rearranging both sides leads to the same structure of the equation as for /r(/): 

Lemma 2 The following recurrent relation is valid for the coefficients v(f,v): 



v{f, v) = v{f,v 
Furthermore, 



1) ■ Df{k - v) 

/'</ 



IMI 



+E 

/">/ 



\Mf\ 



TifJiu)- 



0, 




*//e 


M„ 


j > *; 


r(f) 










IMI’ 




*//e 


Ml, 


1 = 1; 


1 

1 — 1 


TifiJA) 

IMI ’ 


*//e 


Mo, 


i = 1. 


/i G Ml n A/f 
We note that the representation from Lemma 2 is a linear function with
respect to the values $\nu(f, v)$, i.e., once an arithmetic term $T$ has been added (or
generated) by the previous value of $\nu(f, v-1)$ or “neighbouring” values $\nu(f', v-1)$,
this term is multiplied by factors that only depend either on structural properties
of the neighborhood relation (that is the factor $(s(f) + 1)/|\mathcal{N}_f|$ from $D_f$) or
on differences of the objective function which are expressed by $\varphi(f', f, v)$. The
differences $Z(f') - Z(f)$ do not change during the recursive calculation of values
$\nu(f, v)$; the only parameter that changes in $\varphi(f', f, v)$ is $v$. Therefore, any $\nu(f, v)$
and $\mu(f', v)$ can be expressed by a sum of arithmetic terms.

We consider in more details the terms associated with elements of Mq and 
Ml. We assume a representation — = J2u' '^u' ^(/w~l) = 

Since there are no /' < /' for /' G Mq, we obtain: 



1 - EuTuif) 



m(/', v) = Df,{k-v)-J2 T'u' if) +E • ^if^ /'> f 



/>/' 






/>/' 



/>/' 



J\ff> 



As can be seen, the values X)/>/' I A//' | are generated at each time 

step with the corresponding f , v). The remaining summands of /.t(// v) are 

individual arithmetic terms /„ and /(,/ from the previous step, multiplied by a 
positive or a negative factor. 

In the same way we obtain for elements of Mi: 



ff,v) 



rjf) 

lAT/l 






1) ■Df{k-v) + E 

r>f 



I AT/ I 



-E 

/' < / 
/' 6 Mo 



iiu'T'An 

\^f\ 



vif',f,v) - 



where r(/)/ | A// | is from (1 — (s(/) -I- 1)/ | A// |). Again, the sign changes 
only for terms fu,{f) because Df{k — v) > 0. The term r(/)/ | A// | appears 
in all recursive equations of v(f,v), f G Mi and r; > 1, and the same is valid 
for the value J2f>f‘Piff^f/ I -E' I k-if^f^ f ^ Mq. Therefore, all 

arithmetic terms / are derived from expressions of the type r{f)/ \ A/ | and 
J2f>f V’ifi // '*^)/ I A/f' I . This justifies the following 

Definition 3 The expressions r{f)/ \ A/ |, / G Mi, and X)/>/' V^(/i /^ ^)/ 
I A/f' I, /' G Mq, are called source terms of v(f,v) and pi{f',v), respectively. 

During an expansion of $\sum_{f \notin M_0} \mathbf{a}_f(k)$ backwards according to (11), the source
terms are distributed permanently to higher distance levels $M_i$. That means, in
the same way as for $M_1$, the calculation of $\nu(f, v)$ is repeated almost identically
at any step; only the “history” of generations becomes longer. Therefore, at
higher distance levels the notion of a source term can be defined by an inductive
step:

Definition 4 // / G Mi, i > 1, any term that is generated according to the 
equation of Lemma\^from a source term ofv{f,v— 1), where f € 
is said to be a source term ofv{f,v). 

We introduce a counter $r(T)$ to terms $T$ that indicates the step at which the
term has been generated from source terms. The value $r(T)$ is called the rate of
a term and we set $r(T) = 1$ for source terms $T$.

The value $r(T) > 1$ is assigned to terms related to $M_0$ and $M_1$ in a slightly
different way compared to higher distance levels because at the first step the
$f' \in M_0$ do not participate in the expansion of $\sum_{f \notin M_0} \mathbf{a}_f(k)$. Furthermore,
in the case of $M_0$ and $M_1$ we have to take into account the changing signs of
terms that result from the simultaneous consideration of $\nu(f, v)$ (for $M_1$) and
$\mu(f', v)$ (for $M_0$). Basically, a rate $r(T) > 1$ is assigned to a term that was at
the preceding step an $(r(T) - 1)$-term or $(r(T) - 2)$-term at the same or a higher
distance level, respectively, or an $r(T)$-term at a lower distance level.

Definition 5 A term f is called a rate term of p,(/',u), /' € Mq and 
j > 2, if either f = —T and r(T) = 7 — 1 for some v(f,v — 1), / G Mi DAff, 

iV 0. /• e A/„ nv,. 

A term f is called a rate term ofv{f, v), f G Mi and j > 2, ifv{T) = j — 2 
for some v{f , u — 1), / G M2C\Mf, r(T) = j — 1 for some v{f , u — 1), / G MiflA//, 
or f = —T' and r{T') = j—l for some /' G ModAff with respect to yi{f' ,v — l). 

A term f is called a rate term of v{f,v), f G Mi and i, j > 2, if 
r(T) = J — 1 for some v{f',v — 1), /' G M^+i H A/j, r(T) = j — I for some 
n{f',v — 1), /' G Mi A Af p or f is a rate term of v{f,v — 1) for some 
f G Mi_i. 

The classification of terms will be used for a partition of the summation over all 
terms which constitute particular values $\nu(f,v)$ and $\mu(f',v)$. Let $\mathcal{T}_j(f,v)$ be the
set of rate arithmetic terms of v{f, v) {p-if', v)) related to / G A4g- We set 

(13) $S_j(f,v) := \sum_{T \in \mathcal{T}_j(f,v)} T.$

The same notation is used in case of $f = f' \in M_0$ with respect to $\mu(f,v)$. Now,
the coefficients $\nu(f,v)$, $\mu(f',v)$ can be represented by

(14) $\nu(f,v) = \sum_{j=1}^{v} S_j(f,v)$ and $\mu(f',v) = \sum_{j=1}^{v} S_j(f',v)$.

We compare the computation of $\nu(f,v)$ (and $\mu(f',v)$) for two different values
$v = k_1$ and $v = k_2$, i.e., $\nu(f,v)$ is calculated backwards from $k_1$ and $k_2$,
respectively. Let $S_j^1$ and $S_j^2$ denote the corresponding sums of terms related to
two different starting steps $k_1$ and $k_2$, respectively. Since the source terms
depend on $v$ only (see Definition 13), we obtain from the simple relation
$k_2 - (k_2 - k_1 + v) = k_1 - v$ the following equality:

Lemma 3 Given $k_2 \ge k_1$ and $1 \le j \le k_1$, then

$S_j^1(f,v) = S_j^2(f, k_2 - k_1 + v)$.

We recall that our main goal is to upper bound the sum $\sum_{f \notin M_0} a_f(k)$. When
$a(0)$ denotes the initial probability distribution, we can derive

(15) $\left| \sum_{f \notin M_0} a_f(k_2) - \sum_{f \notin M_0} a_f(k_1) \right| \le \sum_{f \notin M_0} \left| \bigl(\nu(f,k_2) - \nu(f,k_1)\bigr) \cdot a_f(0) \right| + \sum_{f' \in M_0} \left| \bigl(\mu(f',k_2) - \mu(f',k_1)\bigr) \cdot a_{f'}(0) \right|$.

Any element of the initial distribution $a_f(0)$ is simply replaced by 1 and we
obtain for the first part of the sum in accordance with Lemma 3

(16) $\sum_{f \notin M_0} \left| \bigl(\nu(f,k_2) - \nu(f,k_1)\bigr) \cdot a_f(0) \right| \le \sum_{f \notin M_0} \sum_{j=k_1+1}^{k_2} |S_j(f,k_2)|$.
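
The step leading to (16) can be spelled out as follows. This is only a sketch under stated assumptions: the sums in (14) are taken to run over j = 1, ..., v, the superscripts 1 and 2 refer to the starting steps k_1 and k_2 as in Lemma 3, and each entry of the initial distribution satisfies 0 <= a_f(0) <= 1.

```latex
\begin{align*}
\nu(f,k_2) - \nu(f,k_1)
  &= \sum_{j=1}^{k_2} S_j^2(f,k_2) - \sum_{j=1}^{k_1} S_j^1(f,k_1) \\
  &= \sum_{j=1}^{k_2} S_j^2(f,k_2) - \sum_{j=1}^{k_1} S_j^2(f,k_2)
     && \text{(Lemma 3 with } v = k_1\text{)} \\
  &= \sum_{j=k_1+1}^{k_2} S_j^2(f,k_2), \\
\bigl| \bigl(\nu(f,k_2) - \nu(f,k_1)\bigr) \cdot a_f(0) \bigr|
  &\le \sum_{j=k_1+1}^{k_2} \bigl| S_j^2(f,k_2) \bigr| .
\end{align*}
```

Summing the last line over all f outside M_0 gives the right-hand side of (16).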



For / G Mi yf Ml, Mo and j >2 we obtain immediately from Lemma | 3 and 
Definition El 

(17) Sj{f,v) = Sj_i{f,v-l)- Dji{k-v)+ ^ + 

/' < / 

/' G Mi+i 



f" > f ^ 

f" G Mi+i 



f <f ^ 

/' G Mi_i 



+ E 



Si(/',^-l) 



+ E 



Si(r,i^-i) 



/' < / 
/' G Mi_i 



■E I > f I E 






f G Mi_i 

In case of / G Mi and j > 2 , we have in accordance with Definition El 



(18) Sj{f,v) = Sj-i{f,v - 1) -Df{k-v) + 

+ 11 1 : 



f > f 

]" G M2 



/'GMonW/ 



Si--i(/',^-l) 

lATfl 



Finally, the corresponding relation for /' G A/q is given by 

(19) Sj(/',u) = Sy_i 1) • D//(fc-r;) - E ^^~4//'^l ' V^(/’ 

/>/' I -^ 1 ' I 
The recursive application of (17) till (19) generates a tree-like structure
$T(f,j,v)$, where a particular $S_j(f,v)$ represents the root of the tree. The leaves
are source terms from Mi and Mg after a full recursion of v steps.

Positive and negative terms are considered separately, and a uniform upper 
bound of $|S_j(f, k_2)|$ is calculated from the leaves to the root. Here, we distinguish
between M\ and Mg. The terms from Mg are multiplied by an extra factor 
ip{fi,f,l)/ \Aff I, and therefore it is sufficient to consider the terms generated 
by elements of Mi. 

Since any element of Mi has a neighbour from Mg and the terms arising 
from the lowest level are treated separately, the terms from Mg (i.e., generated 
by / itself and neighbours A// fl Mg) are multiplied by a factor smaller than 
1. At each step towards the root, we take the values from elements of Mi as 
a new summand to the terms generated at higher levels Mi, i > 1. Therefore, 
the already existing summands are also multiplied by a factor smaller than 1, 
when summands from neighbouring elements with the same generation step are 
taken together. The factor is determined by D' := Df — {si-i + ri-i)ln, when 
n denotes for short an upper bound for | Mf \ . 

Hence, the analysis of terms can be reduced to the terms (new summands) 
generated at level Mg. The particular products (summands) can be classified 
by the number a of self-transitions (i.e., factors $D_f(k - v)$), the number b of
transitions to higher levels (factors <p(/^/, u)), and c steps to the same level or 
lower levels of the objective function. Thus, we have a + b + c = v for the three 
types of transitions. For fixed a and b, the products can be upper bounded by 



( 20 ) 



V — a 

2-b 



2-h 

b 



n( 



max 



(n)2 {k-\-2- UiY 






i=i 



where U < (n — 1)^ and to denotes the uniform difference between neighbouring 
elements (see the remark after Theorem ^1. We employ the Stirling formula 
for the following estimation: 



( 21 ) Yi^fifuJu^Vu) <Y[ 



U—1 



u=l (^ + 2 - Vu) 



< 



[biy 



< 






The product (E0» is analysed for both cases a < vj2 and a > vj2 separately. In 
the first case, v > 2*^^^^ is sufficient to obtain an exponentially small bound for 
(Enj. In the second case, we need k > for the factors fi,Vu) to be 

sufficiently small in order to compensate the binomial coefficients. 

The analysis of E]|) leads to a uniform upper bound of | Sj{f, ^ 2 ) | and we 
obtain 

Lemma 4 Given the maximum escape depth F , then k 2 > k\ > 

implies 

I XI (^(■^’^2) - i^(/,fci)) -a/( 0 )| < fc2 ^ ^ • 2 ^1 

f^Mo 
In the same way, the corresponding upper bound for $\bigl(\mu(f', k_1) - \mu(f', k_2)\bigr)$ is
derived.

By our assumption about the size of T (see Section 2), a value of k 2 < 
allows a complete search of T. We assume that such an upper bound is sufficient 
for a convergence according to Theorem ^ Now, we can immediately prove 

Theorem 2 When the convergence is guaranteed for k2 < then k > 

rpir) -|_log‘^(^^ 3/e implies for arbitrary initial probability distributions a(0) and 
$\varepsilon > 0$ the relation

$\sum_{f \notin M_0} a_f(k) < \varepsilon$ and therefore, $\sum_{f' \in M_0} a_{f'}(k) > 1 - \varepsilon$.

Sketch of Proof: We use the following representation: 

f^Mo f^Mo f^Mo 

= - HLk)) •a^~(0) + 

f^Mo 

+ “ m(/'>^ 2)) -a/-(0) + aj(fc2). 

f'£Mo f^Mo 

The value k2 from Lemma 0 is larger but independent of fci = k, i.e., we can 
take a k2 > k such that 

E 

f^Mo 



We employ Theorem QJ i.e., if the constant F from © is sufficiently large, the 
inhomogeneous simulated annealing procedure defined by m, &, and m tends 
to the global minimum of Z on T . We obtain the stated inequality, if additionally 
both differences E f^Mo (^(/> ^2) - v(f ,k)) and E/'sMo “ Kf,k2)) 

are smaller than e/3. Lemma E| implies that the condition on the differences is 
satisfied in case of 



2- 




which is valid, if log"’'(3/e) < k. 



q.e.d. 



4 Concluding Remarks 

We analysed the convergence of logarithmic cooling schedules in terms of the
underlying energy landscape. Under natural assumptions about the configuration
space we succeeded in proving a lower bound on the number of cooling steps that
are sufficient to approach with probability $1 - \varepsilon$ the set of optimum solutions.
The lower bound is given by $n^{O(\Gamma)} + \log^{O(1)}(1/\varepsilon)$, where $\Gamma$ is the maximum
escape depth from local minima and $n$ describes the size of the neighbourhood
of configurations. When $\Gamma$ can be calculated or estimated, the result can provide
good or improved upper bounds for the time complexity of stochastic algorithms.


Abstract. Hierarchical state machines are finite state machines whose
states themselves can be other state machines. Hierarchy is a useful construct
in many modeling formalisms and tools for software design, requirements
and testing. We summarize recent work on hierarchical state machines
with or without concurrency. We discuss issues of expressiveness
and succinctness, the complexity of basic operations (such as emptiness,
equivalence etc.), and algorithmic problems in the analysis of hierarchical
machines to check for their properties.



1 Introduction 

Finite state machines constitute one of the most fundamental modeling
mechanisms in Computer Science. They have been widely used to model systems in a
wide variety of areas, including sequential circuits, event-driven software,
communication protocols and many others. In its most basic form, a finite state
machine consists of a directed graph whose nodes represent system states and
edges correspond to system transitions. When equipped with an input (and/or
output) alphabet, FSM’s become language recognizers defining the regular
languages (or transducers). A rich theory of FSM’s has been developed since the
50’s regarding their expressive power, their operations and the analysis of their
properties.

In practice, several extensions are useful to enhance the expressive power and 
to describe more complex systems more efficiently. Two popular extensions that 
are fairly well understood are the addition of variables to form so-called extended 
finite state machines, and the addition of concurrency (communication) to form 
communicating finite state machines. A third useful extension that has been 
less well studied is the addition of a hierarchical (nesting) capability, to form
hierarchical state machines, i.e., machines whose nodes can be ordinary states 
or superstates which are FSM’s themselves. This is a very useful construct to 
structure and represent large systems. 

The notion of hierarchical FSMs was popularized by the introduction of
Statecharts, and exists in many related specification formalisms such as
Modecharts and RSML [LHHR94]. It is a central component of various
object-oriented software development methodologies developed in recent
years, such as OMT, ROOM, and the Unified Modeling Language (UML).
This capability is commonly available also in commercial computer-aided
software engineering tools that are coming out, such as

J. van Leeuwen et al. (Eds.): IFIP TCS2000, LNCS 1872, pp. 315-^^^ 2000. 
Statemate (by i-Logix), ObjecTime Developer (by ObjecTime), and
RationalRose (by Rational) (the last two are now combined into
RationalRose-RealTime).

The nesting capability is useful also in formalisms and tools for the
requirements and testing phases of the software development cycle. On the
requirements side, it is used to specify scenarios (or use cases) in a structured
manner. For instance, the new ITU standard Z.120 (MSC’96) for message sequence
charts formalizes scenarios of distributed systems in terms of hierarchical
graphs built from basic MSC’s. The Lucent uBET toolset for behavioral
requirements engineering uses a similar formalism and models requirements by
hierarchical message sequence charts (uBET stands for Lucent Behavioral
Engineering Toolset and is based on the MSC/POGA prototype tools [AHP96]).

On the testing side, FSMs are often used to model systems for the purpose of
test generation, and again the nesting capability is useful to model large systems.
For example, Teradyne’s commercial tool TestMaster is based on an
extended hierarchical FSM model, i.e. it employs hierarchy and variables (but
not concurrency).

In this paper we will summarize recent work on the impact of adding hierarchy
to FSMs, when added alone or in conjunction with concurrency. Most of the
results come from the papers [AY98, AKY99, AY99] to which we refer for more
details. In Section 2 we discuss hierarchical state machines: their expressive
power and succinctness as compared to ordinary FSM’s, the effect of
nondeterminism vs. determinism in this context, the complexity of basic
operations such as checking for emptiness, universality, equivalence etc., and the
model checking of properties on hierarchical state machines. In Section 3 we
discuss the effect of combining concurrency with hierarchy and address similar
questions. In Section 4 we discuss hierarchical message sequence charts. In
Section 5 we comment briefly on issues of testing, and we conclude in Section 6.



2 Hierarchical FSMs 

There are many variants in definitions of finite state machines, depending on
whether one views them as language acceptors/generators (automata), or machines
that react with their environment and produce output (e.g. Mealy and
Moore machines, I/O automata), or logical models (Kripke structures) etc. For
concreteness, let’s settle on the automata definition. An ordinary basic finite
state machine M consists of a finite set Q of states, an initial (or entry) state
$q_0$, a set F of final (or exit) states, a finite input alphabet $\Sigma$, and a set E of
transitions, $E \subseteq Q \times \Sigma \times Q$. Finite state machines are usually drawn as
directed graphs with the states as nodes and the transitions as edges labeled by the
inputs. The language L(M) accepted by M is the set of strings over $\Sigma$ that label
paths from the initial state to a final state.
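
As an illustration, the automata definition can be written down directly. The following is a minimal sketch in Python; the class name FSM, the function accepts and the example machine even_a are our own illustrative choices and not part of the formal development.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FSM:
    states: frozenset       # Q
    initial: str            # q0
    finals: frozenset       # F
    transitions: frozenset  # E, a set of triples (q, a, q')


def accepts(m: FSM, word: str) -> bool:
    """Return True if some path labelled `word` leads from q0 to a state in F."""
    current = {m.initial}
    for a in word:
        # Follow every transition labelled a out of the current set of states.
        current = {t for (s, b, t) in m.transitions if s in current and b == a}
    return bool(current & m.finals)


# Hypothetical example: strings of even length over the alphabet {a}.
even_a = FSM(
    states=frozenset({"even", "odd"}),
    initial="even",
    finals=frozenset({"even"}),
    transitions=frozenset({("even", "a", "odd"), ("odd", "a", "even")}),
)
```

Since E is a relation rather than a function, the sketch tracks a set of current states, so it covers nondeterministic machines as well.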

Hierarchical machines (HSM) are finite state machines whose states are
themselves other basic FSM’s or previously defined hierarchical machines. For
simplicity we just give the formal definition for the single entry-single exit case.
Formally, HSM’s are defined inductively as follows. In the base case, a basic
FSM is a hierarchical machine. Inductively, suppose that $\mathcal{M}$ is a set of HSM’s.
If N is a FSM with set of states Q and $\mu$ is a labeling function $\mu : Q \to \mathcal{M}$ that
associates each state $q \in Q$ with a HSM in $\mathcal{M}$, then the triple $(N, \mathcal{M}, \mu)$ is a
HSM.

A hierarchical machine H represents in a compact notation a corresponding
ordinary FSM flat(H), the flattened version of H. The flat FSM can be
constructed inductively as follows. In the base case, flat(H) = H. For the inductive
step, if $H = (N, \mathcal{M}, \mu)$ then replace in the FSM N each state q by a copy of the
flat version flat($\mu(q)$) of its corresponding machine; note: if several states map
to the same machine of $\mathcal{M}$ then we replace them by distinct copies of the flat
machines with distinct states. Transitions of N coming into state q are directed
to the entry state of the copy of flat($\mu(q)$) and transitions of N coming out of
q are now coming out of the exit state of flat($\mu(q)$). If the entry state of N is
$q_0$ and the exit state is $q_f$, then the entry state of flat(H) is the entry state of
flat($\mu(q_0)$) and the exit state of flat(H) is the exit state of flat($\mu(q_f)$).
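
The inductive construction of the flat machine can be sketched as a short recursive procedure. The Python code below is only an illustration for the single entry-single exit case; the names Basic, Hier and flatten, and the use of fresh integers as state names for the distinct copies, are assumptions of this sketch.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Basic:
    states: list   # state names
    entry: str
    exit: str
    edges: list    # triples (q, a, q')


@dataclass
class Hier:
    top: Basic     # the top-level FSM N
    mu: dict       # maps every state of N to a Basic or Hier machine


def flatten(h, fresh=None):
    """Return (states, entry, exit, edges) of the flat machine of h."""
    fresh = fresh if fresh is not None else count()
    if isinstance(h, Basic):
        # Base case: copy the machine, giving every state a fresh name.
        rename = {q: next(fresh) for q in h.states}
        edges = [(rename[q], a, rename[r]) for (q, a, r) in h.edges]
        return list(rename.values()), rename[h.entry], rename[h.exit], edges
    states, edges, entry_of, exit_of = [], [], {}, {}
    for q in h.top.states:
        # Each state gets its own distinct copy of the flat sub-machine.
        s, en, ex, e = flatten(h.mu[q], fresh)
        states += s
        edges += e
        entry_of[q], exit_of[q] = en, ex
    for (q, a, r) in h.top.edges:
        # An edge q -> r of N leaves the exit of q's copy and enters r's entry.
        edges.append((exit_of[q], a, entry_of[r]))
    return states, entry_of[h.top.entry], exit_of[h.top.exit], edges


# Hypothetical two-level example over the alphabet {a}.
leaf = Basic(states=["s"], entry="s", exit="s", edges=[])
top = Basic(states=["p", "q"], entry="p", exit="q", edges=[("p", "a", "q")])
m1 = Hier(top, {"p": leaf, "q": leaf})
m2 = Hier(top, {"p": m1, "q": m1})
```

In this example flatten(m2) yields a machine with four states and three edges, although m1 is stored only once in the hierarchical description.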

The definition of hierarchical machines for the multiple entry-multiple exit
case is similar, except that one has to specify in this case the set of entry and
exit states, and to specify for every transition (q, a, q') of the top level FSM N
in addition an associated exit state of $\mu(q)$ and an entry state of $\mu(q')$. We omit
here the formalization.

An HSM H can be represented by a rooted DAG, whose leaves are basic
FSM’s and each internal node u has an associated FSM $M_u$ and a mapping
from the states of $M_u$ to the children of u. The size of the (representation of
the) HSM H is the sum of the sizes (numbers of states and transitions) of all
the machines $M_u$ at the nodes of the DAG.

Note that the same machine $M_j$ may be used by many other machines $M_i$
and by many of their states. This reuse permits exponential succinctness as
compared to the flattened machine. Consider for example the HSM depicted in
Figure 1, over a unary alphabet $\Sigma = \{a\}$. At each level, each $M_i$ has two states
that are mapped to the machine $M_{i-1}$ at the previous level. The size of the HSM
in this example is O(n). However, the flattened machine in this case is simply
a path of length $2^n - 1$. Thus, the machine is a counter that accepts just one
string, the string of length $2^n - 1$.
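
The gap between the two sizes can be checked numerically without building the flat machine. The sketch below encodes a counter in the spirit of Figure 1 and counts the states and transitions of the flat machine by memoized recursion over the DAG. The encoding is an assumption of this sketch (in particular, the bottom machine is taken to be a single state), as are the names counter_hsm, representation_size and flat_size.

```python
from functools import lru_cache


def counter_hsm(n):
    """Hypothetical encoding: machine 0 is a single state; machine i has two
    states, both mapped to machine i-1, joined by one transition labelled a."""
    machines = {0: {"states": ["s"], "edges": [], "mu": {}}}
    for i in range(1, n + 1):
        machines[i] = {
            "states": ["p", "q"],
            "edges": [("p", "a", "q")],
            "mu": {"p": i - 1, "q": i - 1},
        }
    return machines


def representation_size(machines):
    """Sum of the numbers of states and transitions over all machines."""
    return sum(len(m["states"]) + len(m["edges"]) for m in machines.values())


def flat_size(machines, root):
    """Return (states, transitions) of the flat machine without building it."""

    @lru_cache(maxsize=None)
    def size(name):
        m = machines[name]
        if not m["mu"]:
            return len(m["states"]), len(m["edges"])
        states = edges = 0
        for q in m["states"]:
            s, e = size(m["mu"][q])  # computed once per machine, then reused
            states += s
            edges += e
        return states, edges + len(m["edges"])

    return size(root)
```

For n = 10 the description has size 31, while the flat machine has 1024 states and 1023 transitions, i.e. it is a path of length $2^{10} - 1$.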

2.1 Expressibility 

Hierarchical state machines are essentially pushdown machines with a pushdown
stack that is bounded by the input. They capture of course the same languages
as ordinary FSM’s (regular languages); they gain only in succinctness. If a HSM
H is defined by a DAG D of r machines $\{M_i\}$, each with at most n states,
and if the nesting depth is m (i.e. the maximum length of a path in the DAG),
then the flattened FSM has at most $O(n^m)$ states: every state of flat(H) can
be represented as a tuple $(q_1, q_2, \ldots, q_t)$, $t \le m$, of states of the given machines
$M_i$, where $q_1$ is a state of the top-level machine, $q_2$ is a state of $\mu(q_1)$, and so
Fig. 1. An exponential counter



forth, i.e. every other state $q_j$ of the tuple is a state of the machine to which the
previous state $q_{j-1}$ of the tuple is mapped. Thus, hierarchical machines can be
translated to ordinary FSM’s at an exponential cost. This is in general inherent.

We can consider the impact of hierarchy on the relation between determinism 
and nondeterminism. Recall that a FSM is deterministic if for every state q and 
every input symbol a there is at most one transition out of q labeled a. We say 
that a HSM is deterministic if its flat FSM is deterministic. 
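
The determinism condition on a flat FSM is easy to state operationally. The following Python fragment is a minimal sketch; the function name is_deterministic and the representation of transitions as triples (q, a, q') are illustrative assumptions.

```python
def is_deterministic(transitions):
    """True if no state has two distinct transitions with the same label."""
    seen = set()
    for (q, a, _target) in set(transitions):  # ignore exact duplicates
        if (q, a) in seen:
            return False
        seen.add((q, a))
    return True
```

For an HSM the check as defined above refers to the flat machine, so applying this sketch literally would require flattening first.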

In the case of ordinary FSM’s there is of course an exponential gap in
succinctness between deterministic and nondeterministic FSM’s. For hierarchical
machines the gap becomes doubly exponential. That is, there are
(nondeterministic) HSM’s of size n such that the smallest equivalent deterministic HSM
has size doubly exponential in n. An example of such a family of languages is
the set $L_n$ of strings w of length $2 \cdot 2^n$ such that there is an index i for which
$w_i = w_{i+2^n}$. A nondeterministic HSM of linear size can guess i, record count
$2^n$ as in Figure 1, and check that $w_i = w_{i+2^n}$. On the other hand, deterministic
HSM’s need doubly exponential size for this language [AKY99].

Of course, the gap between nondeterministic HSM’s and deterministic HSM’s
or even basic FSM’s is no worse than doubly exponential: one exponential suffices 
to translate from HSM to nondeterministic basic FSM, and another exponential 
to deterministic FSM. 

Comparing deterministic hierarchical machines with nondeterministic basic
FSM’s, there is an exponential gap in both directions. That is, on the one hand,
there is a family of languages that is accepted by linear size nondeterministic
FSM’s but needs exponential size for deterministic HSM’s; one such language is
the set of strings w of length 2n such that $w_i = w_{i+n}$ for some i. On the other
hand, there is a language that can be accepted by a linear size deterministic
HSM but needs exponential size for nondeterministic FSM’s; one such language
is the set of strings of the form $w w^R$, i.e. the second half is the reverse of
the first half w of the input string.



2.2 Operations 

Usual operations of interest for a state machine and its language L are emptiness
(is $L = \emptyset$?), universality (is $L = \Sigma^*$?), and complementation (construct a machine
that accepts the complement). For pairs of state machines $M_1$, $M_2$, operations
of interest include intersection (is $L(M_1) \cap L(M_2) = \emptyset$?, and compute a machine
that accepts the intersection), and comparison of the two machines: equivalence
(is $L(M_1) = L(M_2)$?) and containment or inclusion (is $L(M_1) \subseteq L(M_2)$?). We
discuss the complexity of these operations for hierarchical machines.

Emptiness. This is equivalent to a reachability problem in the flattened
machine: can the initial state reach a final state? This problem can be solved in
linear time in the size of the HSM by a simple dynamic programming search
algorithm. In the case of flat FSM’s the problem is in NL (nondeterministic logspace).
In the case of hierarchical machines, the problem becomes P-complete.
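
One way to realize such a dynamic programming search in the single entry-single exit case is sketched below in Python. For every machine of the DAG we compute once whether its exit is reachable from its entry, and a state with a sub-machine may be traversed only if that sub-machine is passable. The dictionary encoding and the names nonempty and passable are assumptions of this sketch, which is not claimed to be the algorithm of the cited papers.

```python
def nonempty(machines, root):
    """machines maps a name to a dict with keys 'entry', 'exit', 'edges'
    (triples (q, a, q')) and optionally 'mu' (state -> machine name)."""
    memo = {}

    def passable(name):
        # Is the exit of the flat version reachable from its entry?
        if name in memo:
            return memo[name]
        m = machines[name]
        mu = m.get("mu", {})

        def ok(q):
            # An ordinary state is always traversable; a superstate only if
            # its sub-machine can be crossed from entry to exit.
            return q not in mu or passable(mu[q])

        succ = {}
        for (q, _a, r) in m["edges"]:
            succ.setdefault(q, []).append(r)

        found = False
        if ok(m["entry"]):
            seen, stack = {m["entry"]}, [m["entry"]]
            while stack:
                q = stack.pop()
                if q == m["exit"]:
                    found = True
                    break
                for r in succ.get(q, []):
                    if r not in seen and ok(r):
                        seen.add(r)
                        stack.append(r)
        memo[name] = found
        return found

    return passable(root)
```

Each machine is searched once and every later query is a table lookup, so the running time is linear in the size of the hierarchical description rather than in the size of the flat machine.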

Universality. As in the case of basic FSM’s, this is a harder problem. As is
well-known, for basic FSM’s universality is PSPACE-complete. Hierarchy costs
an additional exponential in this case: universality for HSM’s is
EXPSPACE-complete. Note that universality is emptiness for the complement, and
emptiness is easy (hence complementation is hard). For deterministic HSM’s we can
complement them easily, and hence we can solve universality in linear time.

Intersection. For basic FSM’s intersection is a polynomial operation. This
holds also for intersection of a HSM with a FSM: that is, given a HSM H and
a (basic) FSM M, we can define their product, which is another HSM H' such
that $L(H') = L(H) \cap L(M)$. The size of H' is bounded by $|H||M|^2$; basically,
the product involves at worst pairing every machine in the representation of H
with a copy of M initialized at each state of M. However there is no such simple
product construction for two HSM’s: determining whether the intersection of the
languages of two HSM’s is empty turns out to be PSPACE-complete.

Equivalence and Inclusion. Since universality is EXPSPACE-hard, it follows 
that equivalence and containment are at least as hard. Indeed they are both in 
EXPSPACE, by flattening and complementing one of the machines, and forming 
the product with the other machine. In the case of deterministic HSM’s (for 
which universality as well as complementation is easy), inclusion turns out to be 
PSPACE-complete. As for equivalence of deterministic HSM’s, it can be done in 
PSPACE but the precise complexity is open. 

2.3 Verification



A hierarchical state machine H is considered now as modeling a system. The
model checking problem asks whether a given system model satisfies a given
property P. The model checking problem has been studied for a variety of classes
of properties, and there are several software tools to automatically perform the
verification for finite state systems (e.g. SPIN [Hol97], COSPAN [Kur94], SMV
[McM93]). Typically, properties are defined either using automata, or some form
of logic (linear time or branching time). Usually models are finite state
structures that have labelled nodes instead of edges; specifically, there is a finite set
of propositions, and the nodes (states) are labelled by the propositions
satisfied at the state. We defined above hierarchical machines using edge labels (to
conform with language generation), but one can equally well define them using
node labels; there is not a significant difference between edge-labelled and
node-labelled machines. We say that the hierarchical machine H satisfies a property
P if flat(H) does.

The simplest kind of property is a state invariant, i.e., the property that 
starting from the initial state, the system stays within a specified subset of 
states. This amounts to a simple reachability problem and can be solved in 
linear time in the size of the HSM. So-called ‘safety’ properties that depend on 
finite computations can be also reduced to a reachability problem. 

A much richer class of properties is the class of linear time properties
expressed by linear temporal logic (LTL) [Pnu77] or by Büchi automata. These
are properties on the infinite computations of the system (i.e. paths of the state
machine). The model satisfies a given property if all of its paths starting from
the initial state satisfy the property. A Büchi automaton A is like a usual finite
string automaton (i.e. a basic FSM); the only difference is that its language is
defined to be a set of infinite strings, namely the strings that label infinite paths
(starting from the initial state) which go infinitely often through one of the
accepting states. We will not define LTL here, but suffice it to say that every LTL
formula $\phi$ can be translated to an equivalent Büchi automaton A of size
$2^{O(|\phi|)}$ [VW86]; and in the automata theoretic approach (and in tools like SPIN) this is
the way that LTL properties are checked, i.e. by conversion to automata. This
does not hurt the complexity, because the exponential dependence on the size
of the formula is unavoidable (but formulas are typically quite small, so this is
tolerable). It is more convenient to use a Büchi automaton to specify the bad
computations, those that signify an error, rather than the correct computations.
Then the model checking problem for a model M amounts to the question of
nonemptiness of the intersection $L(M) \cap L(A)$, where the model M is now viewed
as generating a set of infinite strings.

The core algorithmic problem in this case is not just reachability, but the
following Cycle Detection problem: Given a directed graph (a basic FSM) with
an initial node and a set of specified ‘special’ nodes, can the initial node reach a
cycle that contains one of the special nodes? The model checking problem for a
flat FSM M and a Büchi automaton A is reduced to this cycle detection problem,
by simply taking the product of M and A and letting the special nodes be those
that correspond to accepting nodes of A. The cycle detection problem can be
solved in linear time, either by computing the strongly connected components
of the graph, or by an alternative “nested depth-first search” algorithm that is
useful for on-the-fly verification [CVWY92].
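
For a flat graph, the nested depth-first search can be sketched as follows in Python. This is a textbook-style recursive rendering intended for small examples only (Python's recursion limit applies), not the hierarchical adaptation; the function name has_special_cycle and the adjacency-dictionary encoding are assumptions of the sketch.

```python
def has_special_cycle(succ, initial, special):
    """succ maps a node to a list of successors. Return True if `initial`
    can reach a cycle that passes through a node in `special`."""
    visited_outer, visited_inner = set(), set()

    def inner(node, seed):
        # Second search: look for a path leading back to the seed.
        visited_inner.add(node)
        for nxt in succ.get(node, ()):
            if nxt == seed:
                return True
            if nxt not in visited_inner and inner(nxt, seed):
                return True
        return False

    def outer(node):
        # First search: explore all successors, then (in postorder) start
        # the second search from every special node.
        visited_outer.add(node)
        for nxt in succ.get(node, ()):
            if nxt not in visited_outer and outer(nxt):
                return True
        return node in special and inner(node, node)

    return outer(initial)
```

The set visited_inner is shared by all inner searches, which is sound because they are started in postorder of the outer search; this is what keeps the total work linear in the size of the graph.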

The model checking problem for hierarchical machines can be similarly
reduced to the cycle detection problem for HSM’s: As we mentioned earlier, we
can form the product of the HSM H with the automaton A, which is another
HSM H'. Then H and A have a nonempty intersection iff flat(H') has a
reachable cycle that contains a special node. Finding strong components of flat(H')
is not convenient (there are too many of them), but the cycle detection problem
can still be solved in linear time in the size of H' by suitable adaptation of the
nested depth-first search algorithm. This yields an algorithm for model
checking of an HSM H with a Büchi automaton A that is linear in the size
of the HSM and quadratic in the size of A (usually the system is much larger
than the property, so the dependence on the system size is the more critical
measure). Thus, we can verify a hierarchical machine without flattening it, i.e.,
there is essentially no penalty for the exponential succinctness of hierarchy in
this case.

The other major brand of temporal logic is branching temporal logic. In the
case of basic FSM’s, checking a model M for a formula $\phi$ in branching time logic
CTL is easier than LTL: it can be done in time $O(|M||\phi|)$, i.e. linear in both
the size of the model and the specification. In the case of hierarchical systems,
if we flatten the hierarchical machine and apply the pure FSM algorithm, the
complexity of the algorithm will be exponential in the size of the hierarchical
model and linear in the size of the formula $|\phi|$. Typically, formulas are small
and models are large, so this is not a good trade-off. There is an alternative
better algorithm, which makes the trade-off in the other way, and has complexity
exponential in the size of the formula. The dependence on the model size is
affected by the number d of exit nodes allowed in the nested machines. Namely,
the model checking problem for a HSM H and a CTL formula $\phi$ can be solved
in time $O(|H| 2^{|\phi| d})$, and also in polynomial space. Thus, in general we can again
avoid flattening the HSM. In the single exit case, the time complexity is linear
in the size of the HSM, though in the multiple exit case it grows exponentially
with the number of exit states. These dependencies are probably inherent: In the
single-exit case, if the formula is part of the input then the problem is
PSPACE-complete, so the complexity presumably has to be exponential in either the model
size or the formula size, and of course we are better off having the exponential
dependence on the formula rather than the model. In the multiple exit case, there
is a fixed formula for which the problem is PSPACE-complete (so the model size
has to enter exponentially somehow). We summarize in the table of Figure 2
the complexity of the various analysis problems in the checking of properties for
HSM’s, as compared to FSM’s.

Fig. 2. Summary of FSM and HSM model checking



3 Concurrency and Hierarchy 

A concurrent (or communicating) hierarchical state machine, CHM, combines
concurrency and hierarchy in an arbitrarily nested manner. We define it formally
here for the single entry-single exit case; the definition for the multiple entry/exit
case is similar.

A CHM is defined again inductively. In the base case, every ordinary FSM is
a CHM. For the induction step, a CHM H is either a hierarchical combination
of previously defined CHM’s exactly as before, or it is a concurrent (parallel)
combination $M_1 \| M_2 \| \cdots \| M_k$.

The flat basic FSM, flat(H), is defined again inductively. It is the same
as before in the basis case and in the case of a hierarchical combination. In
the concurrent combination, flat(H) is the product of the FSM’s flat($M_i$),
defined as follows. Assume that flat($M_i$) has set of states $Q_i$, alphabet $\Sigma_i$,
initial state $q_0^i$, final state $q_f^i$, and transition relation $E_i$. Then the state space of
H is $Q_1 \times \cdots \times Q_k$, the alphabet is $\Sigma_1 \cup \cdots \cup \Sigma_k$, the initial state is
$(q_0^1, \ldots, q_0^k)$, the final state is $(q_f^1, \ldots, q_f^k)$, and the transition relation has a
transition labelled a from state $(u_1, \ldots, u_k)$ to state $(v_1, \ldots, v_k)$ if every
component machine $M_i$, whose alphabet $\Sigma_i$ contains the symbol a, has a transition
from $u_i$ to $v_i$ labelled a. The language L(H) of a CHM H is the language of its
corresponding FSM flat(H).
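
The product used for the concurrent combination can be sketched in Python as follows. The dictionary encoding and the name concurrent_product are illustrative assumptions. The sketch also assumes that a component whose alphabet does not contain the symbol a keeps its state on an a-transition; this is the usual convention but is not spelled out in the definition above.

```python
from itertools import product


def concurrent_product(machines):
    """Each machine is a dict with keys 'states', 'alphabet', 'initial',
    'final' and 'delta' (a set of triples (u, a, v))."""
    alphabet = set().union(*(m["alphabet"] for m in machines))
    delta = set()
    for us in product(*(m["states"] for m in machines)):
        for a in alphabet:
            options = []
            for m, u in zip(machines, us):
                if a in m["alphabet"]:
                    # Participating component: must take an a-transition.
                    options.append(
                        [v for (p, b, v) in m["delta"] if p == u and b == a]
                    )
                else:
                    # Assumption: non-participating components stay put.
                    options.append([u])
            for vs in product(*options):
                delta.add((us, a, vs))
    return {
        "states": set(product(*(m["states"] for m in machines))),
        "alphabet": alphabet,
        "initial": tuple(m["initial"] for m in machines),
        "final": tuple(m["final"] for m in machines),
        "delta": delta,
    }


# Hypothetical components that synchronize on the shared symbol a.
m1 = {"states": [0, 1], "alphabet": {"a", "b"}, "initial": 0, "final": 1,
      "delta": {(0, "a", 1), (1, "b", 0)}}
m2 = {"states": ["x", "y"], "alphabet": {"a", "c"}, "initial": "x", "final": "y",
      "delta": {("x", "a", "y"), ("y", "c", "x")}}
```

The enumeration of the full state space makes the cost of the product explicit: the number of product states is the product of the component sizes, which is the source of the blow-up caused by concurrency.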

A concurrent hierarchical machine H can be represented by a rooted DAG
(directed acyclic graph) corresponding to the way it is built from basic FSM’s.
The leaves are basic FSM’s and each internal node is either labelled by the
concurrent combinator ||, or corresponds to a hierarchical combination and is
labelled by a machine N and a mapping $\mu$ from the states of N to the children
of the node. Since || is an associative operator, we can assume that the children
of || nodes are not themselves || nodes (otherwise we can combine them). The
size of the CHM H is the size of its DAG representation and all the machines in
it. Two important parameters for a CHM are the depth m of the DAG (length of
the longest path) and the width w which is the maximum number of components
of a concurrent node of the DAG (i.e. node labelled ||).

3.1 Expressiveness 

Concurrent hierarchical state machines still define of course only regular
languages. If a CHM H has width w and depth m, and every FSM in its definition
has n states, then flat(H) has $O(n^{w^m})$ states, that is, flattening causes now in
general a double exponential blow up.

This is in general unavoidable. In fact concurrency adds two exponentials
even compared to hierarchical state machines: There is a family of languages
that can be recognized by linear size CHM’s but which require HSM’s (and
hence also basic FSM’s) of double exponential size. Furthermore, there is a triple
exponential gap in the worst case between CHM’s and deterministic hierarchical
state machines (and hence also deterministic FSM’s). An example of such a
language family is
$L = \{ w_0 \# w_1 \# \cdots \# w_k \mid |w_i| = 2^n \text{ for each } i \text{ and } w_i = w_j \text{ for some } i \neq j \}$.

The following figure summarizes the expressibility relationships between the 
different kinds of machines. 



Fig. 3. Summary of succinctness relations



3.2 Operations 

Emptiness (Reachability). This cannot be done efficiently any more. Even for
a simple concurrent combination (intersection) of basic deterministic FSM’s we
know that the problem is PSPACE-complete, and we saw in the previous
section that the same is true for the concurrent combination of just two HSM’s.
Mixing concurrency with hierarchy adds another exponential: The reachability
(and emptiness) problem for CHM’s is EXPSPACE-complete.

There are two exponential contributions to this complexity, one due to con- 
currency and one due to hierarchy. For certain classes of machines, we can avoid 
one of the exponentials, reducing the complexity to ‘only’ PSPACE. For exam- 
ple, one case is when all the concurrency is at the top. That is, we have a set 
of concurrent interacting components, where each component is represented by 
a hierarchical state machine. Reachability is in this case in PSPACE (and is of 
course PSPACE-complete). Another case is when the hierarchical construction 
at a level does not expose the internals of the submachines to higher levels. 
Note that in the concurrent composition the alphabets of the components are 
critical. Informally, call a CHM H “well-structured” if the hierarchical nesting
always hides the alphabet of the nested machines as far as further concurrent
combinations with other machines are concerned; that is, in every concurrent
combination M_1 || ... || M_t in the definition of H (where the M_i are without loss
of generality hierarchical nodes), only the alphabets of the top-level components
M_i enter the composition, i.e., the alphabets of the nested submachines are
hidden. Then reachability can be solved in PSPACE and in time O(k n^w), where
k is the number of operators (internal nodes of the DAG), w is the width, and
n is the maximum number of nodes of an FSM in the definition of H.

Fig. 4. Summary of complexity of operations

Universality. The universality problem turns out to be again harder, and is 
as bad as it can get: It is complete for double exponential space. 

Intersection. The intersection of two CHM’s H_1, H_2 is itself another CHM
H_1 || H_2. Emptiness can be done in EXPSPACE (and is complete).

Equivalence, Inclusion. Since universality is hard, the same is true of equi- 
valence and inclusion, which are also complete for double exponential space. 

Finally, model checking for a CHM and a Büchi automaton is, as in the case
of simple reachability, complete for exponential space (with respect to the size of 
the model), i.e., if we nest arbitrarily concurrency and hierarchy we have to pay 
in general for both the concurrency and the hierarchy. It is worth noting howe- 
ver, that these lower bound constructions that give this pessimistic complexity, 
take advantage of the fact that one can define succinctly (because of the reuse) 
systems with exponentially many parallel components; this may be unrealistic 
for significant sizes, and hence the lower bounds may be unduly pessimistic. 

The table of Figure 4 summarizes the complexity results of basic operations 
for the different kinds of machines. 

4 Hierarchical Message Sequence Charts 

These represent the extreme case where all the concurrency is at the bottom 
of the hierarchy. A basic message sequence chart (MSC) represents the partial 
order of the message exchanges in a concurrent execution of a set of processes, 
see Figure 5. The vertical lines represent the processes, time proceeds down 
vertically for each process, and the arrows represent the messages. Send and 
receive events are labelled from some alphabet Σ (for example, the messages
sent or received, or any other labels). An MSC M represents a set of linear
executions, namely all the linearizations of the partial order, and thus it has a
corresponding finite language L(M), the strings that label these linearizations.

Fig. 5. A basic message sequence chart
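
As a small illustration of the language L(M), the following Python sketch enumerates the linearizations of a basic MSC. The encoding (one list of event labels per process, plus a list of send/receive pairs) and the name `linearizations` are assumptions made for this example; they are not notation from the text, and event labels are assumed to be unique.

```python
def linearizations(process_lines, messages):
    """Enumerate all linearizations of a basic MSC.

    process_lines: dict mapping a process to its events, listed top to bottom.
    messages: list of (send_event, receive_event) pairs.
    Event labels are assumed unique (an assumption of this sketch).
    """
    # Build the partial order: each event follows its predecessor on the same
    # vertical line, and every receive follows the matching send.
    preds = {e: set() for line in process_lines.values() for e in line}
    for line in process_lines.values():
        for earlier, later in zip(line, line[1:]):
            preds[later].add(earlier)
    for send, receive in messages:
        preds[receive].add(send)

    result = []

    def extend(prefix, done):
        if len(prefix) == len(preds):
            result.append(prefix)
            return
        for e in sorted(preds):
            # An event may be appended once all its predecessors have occurred.
            if e not in done and preds[e] <= done:
                extend(prefix + (e,), done | {e})

    extend((), frozenset())
    return result


# Two messages from P1 to P2: s1/r1 followed by s2/r2.
example = linearizations(
    {"P1": ["s1", "s2"], "P2": ["r1", "r2"]},
    [("s1", "r1"), ("s2", "r2")],
)
# example == [("s1", "r1", "s2", "r2"), ("s1", "s2", "r1", "r2")]
```

The enumeration is exponential in general, which is consistent with the cost of concurrency inside a single MSC discussed later in this section.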



Using simple MSC’s as basic building blocks, one can combine them in a 
hierarchical fashion, the same way as with hierarchical state machines. We give 
for simplicity the definition for the single entry - single exit case (which can 
be extended to the multiple entries and exits). An MSC-graph is a directed 
graph, with a distinguished initial (entry) node and a final (exit) node, along 
with a mapping μ that associates with every node a basic MSC (labelled from
an alphabet Σ). More generally, we can define inductively a hierarchical MSC
graph (HMSC) to be a hierarchical combination of previously defined HMSC’s.
An HMSC can be represented by a rooted DAG, where each leaf is labelled by a
basic MSC, and each internal node u is labelled by a graph G_u and a mapping
from the nodes of G_u to the children of u. A hierarchical MSC H can of
course be flattened to a simple MSC-graph flat(H) (at an exponential blow-up),
the same way that a hierarchical machine is flattened to an FSM.
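
To make the exponential blow-up concrete, the following Python sketch counts the nodes of flat(H) directly on the DAG, without building the flattened graph. The dictionary encoding (`children[u]` lists, for each node of G_u, the child it is mapped to; a leaf has an empty list) and the name `flat_size` are assumptions of this illustration.

```python
from functools import lru_cache


def flat_size(children, root):
    """Number of basic-MSC nodes in flat(H), computed on the DAG.

    children: dict mapping each DAG node u to a list with one entry per node
    of G_u, naming the child that node is mapped to (leaves map to []).
    """
    @lru_cache(maxsize=None)
    def size(u):
        kids = children[u]
        # A leaf is a single basic MSC; an internal node expands every node
        # of its graph into a full copy of the corresponding child.
        return 1 if not kids else sum(size(c) for c in kids)

    return size(root)


# Each level reuses the level below twice: 4 DAG nodes, 8 flattened nodes.
dag = {"H3": ["H2", "H2"], "H2": ["H1", "H1"], "H1": ["M", "M"], "M": []}
count = flat_size(dag, "H3")  # 8
```

With m such levels the description has m + 1 nodes while flat(H) has 2^m, which is the reuse-induced blow-up mentioned above.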

The executions of a system modeled by an MSC-graph (or HMSC) correspond 
to the paths of the graph starting at the initial node. There are two interpreta- 
tions for the concatenation of the MSC’s of the different nodes along the path. 
In the synchronous interpretation, the concatenation of two nodes u, v is viewed 
as consisting of executing all actions of the MSC of u followed by the actions of 
the MSC of v; that is, when executing a path, every node represents a block of 
activity that completes before going on to the next block. In the asynchronous 
interpretation, the MSC’s of the nodes along the path are pasted together, sepa- 
rately process by process, i.e. by identifying for each process the end of its line 
in one node with the beginning of its line in the next node, thus forming a long 
MSC spanning all the actions of all the nodes. The synchronous concatenation is 
probably closer to the way system engineers and designers think when they draw 
HMSC’s partitioning the activities into blocks. The asynchronous interpretation 
is however closer to what one would get if one was to implement directly the 
HMSC, i.e. without any other coordination or constraints than what is depicted 
explicitly in the HMSC. 

For each of the interpretations, we can define the language of an HMSC, by
considering all the paths starting at the initial node and taking the labelings 
of all the corresponding linear executions. (We can either define a language of 
finite strings, if we require that the paths terminate at a final node, or define a 
language of potentially infinite strings). 
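
The difference between the two interpretations can be seen on a two-node path. The Python sketch below is only an illustration under assumed encodings: an MSC is a pair (per-process event lists, send/receive pairs), event labels are unique across both MSCs, and the helper names `linearizations`, `sync_concat` and `async_concat` are hypothetical.

```python
def linearizations(lines, msgs):
    """All linearizations of an MSC given per-process event lists and (send, receive) pairs."""
    preds = {e: set() for line in lines.values() for e in line}
    for line in lines.values():
        for a, b in zip(line, line[1:]):
            preds[b].add(a)          # order along each vertical line
    for s, r in msgs:
        preds[r].add(s)              # a message is sent before it is received
    out = []

    def extend(prefix, done):
        if len(prefix) == len(preds):
            out.append(prefix)
            return
        for e in sorted(preds):
            if e not in done and preds[e] <= done:
                extend(prefix + (e,), done | {e})

    extend((), frozenset())
    return out


def sync_concat(u, v):
    """Synchronous reading: all of u completes before any event of v."""
    return [a + b for a in linearizations(*u) for b in linearizations(*v)]


def async_concat(u, v):
    """Asynchronous reading: paste the lines process by process, then linearize."""
    (lines_u, msgs_u), (lines_v, msgs_v) = u, v
    procs = set(lines_u) | set(lines_v)
    lines = {p: lines_u.get(p, []) + lines_v.get(p, []) for p in procs}
    return linearizations(lines, msgs_u + msgs_v)


# u: P1 sends to P2; v: P3 sends to P4 (disjoint sets of processes).
u = ({"P1": ["s1"], "P2": ["r1"]}, [("s1", "r1")])
v = ({"P3": ["s2"], "P4": ["r2"]}, [("s2", "r2")])
sync_runs = sync_concat(u, v)    # 1 execution
async_runs = async_concat(u, v)  # 6 executions
```

Because the two blocks share no process, the asynchronous reading imposes no order between them, so it admits every interleaving of the two messages, while the synchronous reading admits only the execution in which the first block finishes first.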

In the synchronous interpretation, the language of an HMSC H is regular. 
A hierarchical state machine H' can be constructed from H by replacing in the 
representation of the HMSC each basic MSC (i.e., each leaf of the DAG) by an FSM
that recognizes the language of the MSC. Then L(H) = L(H'). This is a very
benign and convenient case, because we can use directly the results for hierarchi- 
cal state machines; it means in particular that the graphs and the hierarchy of 
the HMSC’s do not add anything to the complexity of problems like reachability, 
model checking etc. beyond what’s in the simple basic MSC’s. If an individual 
basic MSC is a ‘large’ one, then the translation to an automaton, as well as 
answering basic questions about the MSC, can be expensive because of the concurrency.
Namely, if an MSC M has k processes with l events in each process,
then the equivalent FSM can have l^k states, and in fact answering a simple question
such as ‘does a given string x belong to L(M)?’ is NP-complete. However,
in many cases in hierarchical MSC’s most of the size of the HMSC is in the use 
case graph; the individual MSC’s are not large and the number of processes is 
often small (for example in telephony, processes correspond to entities such as 
originating and terminating switches and users). 

The hierarchy does not introduce any additional complications beyond the
basic MSC’s under the synchronous interpretation. Thus, for example, reachability,
membership, or the model checking problem for a hierarchical MSC H
of size n and a Büchi automaton A of size a is still “only” NP-complete, and can
be solved in time O(a · n · l^k), where k and l are, as above, the number of processes
and the number of events per process in a basic MSC. Thus, hierarchical MSC’s can
be analyzed under the synchronous interpretation without paying a penalty for
the hierarchy, and if the basic MSC’s are small or do not have too much concurrency
(which is often the case) then the complexity is low. Furthermore, if a
property is linearization-independent (i.e., one linearization of an ordinary MSC
satisfies the property iff every other one does),
then the HMSC H can be checked in time linear in its size: substitute each basic
MSC in the definition of H by an automaton (just a path) that accepts just one
linearization (instead of all of them); the resulting hierarchical machine satisfies
the property iff the given hierarchical MSC does. (For linearization-independent
properties, there is no difference between the synchronous and asynchronous
interpretations as far as satisfaction of the properties is concerned, so this holds also
for the asynchronous case.)

In the asynchronous interpretation, the language of an HMSC is in general 
not regular, and in fact model checking an MSC-graph for a property given by a 
Büchi automaton is undecidable. There is a syntactic condition that ensures that
the language is regular, and which leads to decidability. Given a set S of basic
MSC’s, we can construct a corresponding communication graph CG(S) among
the processes: this is a directed graph with the processes as nodes and with an
arc P_i → P_j if process P_i sends a message to process P_j in some MSC of S. Given
an MSC-graph G (respectively, an HMSC H) we say that it is bounded
if for every cycle K of G (resp. of flat(H)) that is reachable from the initial
state, the corresponding communication graph CG(K) consists of one strongly
connected component and possibly some isolated nodes. If an HMSC is bounded
then its language is regular [AY99]. Conversely, it is shown in [HMNT00] that
if a regular MSC language is definable by an MSC graph (or equivalently,
is finitely generated) then it can be defined by a bounded MSC graph. Boundedness
of an HMSC can be determined without flattening the hierarchy in time
O(n · l · 2^k), where n is the size of the HMSC, k is the number of processes and l is the
number of events per process in a basic MSC (this mild exponential dependence
on k is unavoidable, as the boundedness problem is coNP-complete). We can
analyze bounded HMSC’s algorithmically by constructing an equivalent FSM.
The complexity is higher in this case: model checking a Büchi automaton property
is PSPACE-complete for bounded MSC graphs and EXPSPACE-complete
for HMSC’s (the problem is hard even for a fixed property, a fixed number of
processes k, and a fixed number of events l per basic MSC).
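
The condition on a single cycle can be checked as follows. This Python sketch tests only the communication graph of one given cycle K; it does not enumerate the cycles of an MSC-graph and is not the algorithm behind the time bound quoted above. The encoding (each MSC as a list of (sender, receiver) pairs) and both function names are assumptions, as is the choice to accept a cycle with no messages at all.

```python
def communication_graph(mscs):
    """Arcs (sender, receiver) of CG(S) for a collection S of MSCs.

    Each MSC is given as a list of (sender, receiver) process pairs.
    """
    return {(s, r) for msc in mscs for (s, r) in msc}


def is_one_scc_plus_isolated(arcs):
    """True iff the graph is one strongly connected component plus isolated nodes."""
    active = {p for arc in arcs for p in arc}   # processes that send or receive
    if not active:
        return True  # assumption of this sketch: no communication is accepted

    def reachable(start, edges):
        seen, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for a, b in edges:
                if a == x and b not in seen:
                    seen.add(b)
                    stack.append(b)
        return seen

    start = next(iter(active))
    forward = reachable(start, arcs)
    backward = reachable(start, {(b, a) for a, b in arcs})
    # Strongly connected iff every active process is reachable both ways.
    return forward == active and backward == active


request_reply = communication_graph([[("P1", "P2")], [("P2", "P1")]])
one_way = communication_graph([[("P1", "P2")]])
two_pairs = communication_graph([[("P1", "P2"), ("P2", "P1")], [("P3", "P4"), ("P4", "P3")]])
```

Here `request_reply` satisfies the condition, while `one_way` (the sender can run arbitrarily far ahead of the receiver) and `two_pairs` (two independent strongly connected components) do not.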



5 Testing 

In testing, we have a design model M (e.g., a finite state machine) and the
actual system S (the implementation under test) and we wish to generate a 
suitable set of test cases based on the model, which will be applied to the system 
S to test its correctness, i.e., whether it conforms with the model. There is 
extensive theoretical and practical work on test generation for FSM’s, see for 
instance [LY96] for a survey. There is a variety of testing criteria that can be
used depending on the extent of testing that can be afforded; it can range from 
checking sequences that provide certain guarantees on the relation between M
and S (usually at a rather high cost), to a more commonly applied simpler set 
of coverage criteria, such as covering all the states and transitions of the model. 

Suppose that we have a hierarchical finite state machine model and let’s 
consider the most common criterion of ‘transition coverage’, i.e. finding a set of 
test sequences (paths from the initial state) that cover all the transitions. There 
are two interpretations of this requirement: in the first interpretation we seek to 
cover all transitions of M and its nested submachines in all possible contexts, i.e. 
if a submachine is reused several times, we want to cover all its transitions in all 
the uses; this is equivalent to covering all the transitions of the flattened machine 
flat(M). A covering test set can be computed in time polynomial in the size of
the flat FSM using flow techniques to minimize the number of test sequences
and the total length of the tests. Such an algorithm has been implemented in
Lucent’s uBET toolset; after flattening the hierarchical MSC graph, an optimal
test set is generated, and for each test path, the basic MSC’s of the nodes along 
the path are combined to form a test scenario in the form of an MSC. 

An alternative (and more economical) interpretation of the transition coverage
requirement is to cover each transition of the hierarchical state machine and
each nested submachine at least once overall, i.e., if a submachine is used several 
times, we only need to cover each transition in only one occurrence (and different 
transitions could be covered in different copies). One justification for this could 
be for example if the same implementation of the submachine is used in the 
different copies, in which case testing a transition in one instance suggests that 
it will likely work properly in the other instances as well (although that is not 
necessarily guaranteed). Of course, the number of tests needed under this weaker
requirement is generally much smaller than under the full coverage requirement;
for example, the number of test cases needed is certainly bounded by the size 
of the hierarchical machine, as opposed to the potentially exponentially larger 
size of the flattened machine. Computing an optimal (minimum cardinality) test 
set for the weaker coverage is much harder than for simple FSM’s. For example, 
even in the 2-level case the problem is at least as hard as Set Cover (hence it 
is hard to compute or to approximate the smallest test set within a constant 
factor).
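
Since the weaker coverage problem is at least as hard as Set Cover, one natural heuristic is the standard greedy Set Cover rule, sketched below in Python. This is not an algorithm from the text: the candidate test paths are assumed to be given, each as the set of transitions it covers, and the names `greedy_cover` and `candidate_tests` are hypothetical. The greedy rule does not in general return a minimum test set.

```python
def greedy_cover(transitions, candidate_tests):
    """Pick tests greedily until every transition is covered.

    transitions: iterable of transition identifiers to cover.
    candidate_tests: dict mapping a test name to the set of transitions it covers.
    """
    uncovered = set(transitions)
    chosen = []
    while uncovered:
        # Take the test that covers the most still-uncovered transitions.
        best = max(candidate_tests, key=lambda t: len(uncovered & candidate_tests[t]))
        gain = uncovered & candidate_tests[best]
        if not gain:
            raise ValueError("candidate tests do not cover all transitions")
        chosen.append(best)
        uncovered -= gain
    return chosen


tests = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5}, "D": {1}}
picked = greedy_cover({1, 2, 3, 4, 5}, tests)  # ["A", "C"]
```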

Generally, very little systematic work has been done on test generation for 
hierarchical state machines. 



6 Conclusions 

Hierarchy is a useful construct in enhancing the expressive power of finite state 
machines and facilitating the modeling of large systems. We summarized recent 
work on the theory of hierarchical finite state machines and related specification 
mechanisms (HMSC’s). We discussed the effect of concurrency and nondetermi- 
nism, the classification of their expressive power, and the complexity of various 
algorithmic problems. The picture that emerged is rather comprehensive. One 
specific problem that remains open is the equivalence of deterministic HSM’s. 
More broadly, we did not touch on the interaction with variables (extended 
HSM’s). Also, as we mentioned more work remains to be done on the testing of 
hierarchical FSM’s. 

Another issue concerns design problems, such as the structuring and modu- 
larization of the behavioral aspects of systems, to form a suitable hierarchical 
model. For example, suppose that we have a legacy system which we want to 
capture formally in a requirement model like uBET. We have available or can 
generate a large number of executions of the system to obtain a library of MSC 
scenarios. How do we go from this collection of MSC’s to a higher-level,
structured view of the system, in the form of a hierarchical use case graph built 
from basic MSC’s? 

Similarly, suppose that we have a test model of a system in the form of a 
large FSM; for example, we produced recently automatically such a model for a 
large Lucent enterprise switch from the test platform in use for the testing of the 
switch [EY00]: the FSM model has hundreds of thousands of states. The model
can be used for targeted test generation to meet a variety of criteria. However, the 
machine is obviously too large for a person to look at, or to manipulate directly,
for example to modify as the system evolves, to add new features etc. Can 
we structure such an FSM and obtain an equivalent (hopefully much smaller) 
hierarchical FSM? There is an elegant classical decomposition theory of FSM’s 
[HS66] which may be useful in this context.

Abstract. We add name groups and group creation to the typed ambient
calculus. Group creation is surprisingly interesting: it has the effect
of statically preventing certain communications, and can thus block the 
accidental or malicious escape of capabilities that is a major concern in 
practical systems. Moreover, ambient groups allow us to refine our earlier 
work on type systems for ambient mobility. We present type systems in 
which groups identify the set of ambients that a process may cross or 
open. 



1 Introduction 

The Ambient Calculus is a process calculus based on local communication and
on process mobility. The basic, untyped, calculus can be decorated with static 
information to restrict either local communication, or mobility, or both. 

Exchange control systems can be used to restrict communication. In [CG99]
we have investigated exchange types, which subsume standard type systems for 
processes and functions, but do not impose restrictions on mobility. 

Mobility control systems can be used to restrict mobility. In [CGG99] we investigate
immobility and locking annotations, which are simple predicates about
mobility.

The goal of this paper is to refine our previous work on mobility control, 
by including in the type of a process static descriptions of the set of ambients 
it may cross, and the set of ambients it may open. To do so, we adopt a new 
construction of independent interest. Among the types, we introduce collections 
of names that we call groups; names belong to groups in the same sense that
values belong to types. 

To understand how name groups arise, consider a typical static property we 
may want to express in a type system for the ambient calculus, informally: 

The ambient named n can enter the ambient named m. 

This could be expressed as a typing n : CanEnter{m) stating that n is a mem- 
ber of the collection CanEnter{m) of names that can enter m. However, this 
would bring us straight into the domain of dependent types, since the type 
CanEnter{m) depends on the name m. Instead, we introduce type-level groups 
of names, G, H, and restate our property as: 

The name n belongs to group G; the name m belongs to group H. Any
ambient of group G can enter any ambient of group H.

This idea leads to typing judgments of the form:

process P may cross ambients of group G 
process P may open ambients of group G 

The former reduces to immobility assertions when a process can cross no groups; 
the latter reduces to locking assertions, when members of a group can be opened 
by no process [CGG99].

Among the processes, we then introduce an operation, (νG)P, for creating
new groups. Within P we can introduce new names of group G. The binders
for new groups, (νG), extrude in much the same way as binders for new names,
(νn:G). Because of extrusion, group binders do not impede the mobility
of ambients that are enclosed in the initial scope of fresh groups. However, sim- 
ple scoping restrictions prevent names of a fresh group from ever being received 
outside the initial scope of the group. 

Therefore, we obtain a flexible way of protecting the propagation of names. 
This is to be contrasted with the situation in the untyped π-calculus and ambient
calculus, where names can (intentionally, accidentally, or maliciously) be extru- 
ded arbitrarily far, by the automatic and unrestricted application of extrusion 
rules. 

We organise the paper as follows. In the remainder of this opening section we 
review the basic untyped ambient calculus. Section 2 describes the typed ambient
calculus with groups, obtained by enriching our exchange type system [CG99]
with groups. Section 3 enriches the system of Section 2 to control ambient opening.
In Section 4 we define a system in which the type of a process records both
the groups it may open and the groups it may cross. Section 5 formalizes safety
properties guaranteed by typing. Section 6 concludes and discusses related work.

A technical report contains proofs omitted from this paper [CGG00].



1.1 The Untyped Ambient Calculus (Review) 

An ambient is a named boundary whose interior contains a collection of running
processes, possibly including nested subambients. We explain the untyped
ambient calculus elsewhere in detail, but here we introduce its central
features via a standard example: a[p[out a.in b.⟨c⟩]] | b[open p.(x).x[]].

Intuitively, this example represents a packet named p being sent from a machine
a to a machine b. The example consists of the parallel composition (indicated
by the | operator) of two ambients, named a and b. The brackets [...] represent
ambients’ boundaries. The process p[out a.in b.⟨c⟩] represents the packet, a
subambient of ambient a. The name of the packet ambient is p, and its interior is
the process out a.in b.⟨c⟩. This process consists of three sequential actions: exercise
the capability out a, exercise the capability in b, and then output the name c.
The effect of the two capabilities on the enclosing ambient p is to move p out of
a and into b, to reach the state: a[] | b[p[⟨c⟩] | open p.(x).x[]]. The interior of a is
now empty. The interior of b consists of two running processes, the subambient
p[⟨c⟩] and the process open p.(x).x[]. The latter is attempting to exercise the
open p capability. Previously it was blocked. Now that the p ambient is present,
the effect of open p is to dissolve the ambient’s boundary. Hence, the interior of
b becomes the process ⟨c⟩ | (x).x[]. This is a composition of an output ⟨c⟩ with
an input (x).x[]. The input consumes the output, leaving c[] as the interior of b.
Hence, the final state of the whole example is a[] | b[c[]].
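
The reduction sequence just described can be replayed mechanically. The Python sketch below is an illustrative toy, not part of the calculus as defined in the paper: the tuple encoding, the tags "amb", "cap", "send", "recv" and the function names are all assumptions. It covers only in, out, open and single-name communication, omits restriction and replication, and assumes bound input variables are distinct from all other names, so substitution never captures.

```python
def subst(procs, x, m):
    """Replace free occurrences of the name x by m in a parallel composition."""
    ren = lambda n: m if n == x else n
    out = []
    for p in procs:
        if p[0] == "amb":
            out.append(("amb", ren(p[1]), subst(p[2], x, m)))
        elif p[0] == "cap":
            out.append(("cap", p[1], ren(p[2]), subst(p[3], x, m)))
        elif p[0] == "send":
            out.append(("send", ren(p[1])))
        else:  # "recv": stop if the input rebinds x
            out.append(p if p[1] == x else ("recv", p[1], subst(p[2], x, m)))
    return out


def step(procs):
    """Return the composition after one reduction step, or None if none applies."""
    for i, p in enumerate(procs):
        rest = procs[:i] + procs[i + 1:]
        if p[0] == "cap" and p[1] == "open":            # open n.P | n[Q] -> P | Q
            for k, s in enumerate(rest):
                if s[0] == "amb" and s[1] == p[2]:
                    return rest[:k] + rest[k + 1:] + p[3] + s[2]
        elif p[0] == "send":                             # <M> | (x).P -> P{x<-M}
            for k, s in enumerate(rest):
                if s[0] == "recv":
                    return rest[:k] + rest[k + 1:] + subst(s[2], s[1], p[1])
        elif p[0] == "amb":
            n, body = p[1], p[2]
            for j, a in enumerate(body):
                inner = body[:j] + body[j + 1:]
                if a[0] == "cap" and a[1] == "in":       # n[in m.P | Q] | m[R] -> m[n[P | Q] | R]
                    for k, s in enumerate(rest):
                        if s[0] == "amb" and s[1] == a[2]:
                            moved = ("amb", n, a[3] + inner)
                            return rest[:k] + rest[k + 1:] + [("amb", s[1], [moved] + s[2])]
                elif a[0] == "amb":                      # n[c[out n.P | Q] | R] -> c[P | Q] | n[R]
                    for k, b in enumerate(a[2]):
                        if b[0] == "cap" and b[1] == "out" and b[2] == n:
                            child = ("amb", a[1], b[3] + a[2][:k] + a[2][k + 1:])
                            return rest + [child, ("amb", n, inner)]
            reduced = step(body)                         # reduce inside the ambient
            if reduced is not None:
                return rest + [("amb", n, reduced)]
    return None


def show(procs):
    """Render a composition, sorting components since | is commutative."""
    def one(p):
        if p[0] == "amb":
            return f"{p[1]}[{show(p[2])}]"
        if p[0] == "cap":
            return f"{p[1]} {p[2]}.{show(p[3])}"
        if p[0] == "send":
            return f"<{p[1]}>"
        return f"({p[1]}).{show(p[2])}"
    return " | ".join(sorted(one(p) for p in procs))


def run(procs):
    trace = [show(procs)]
    while (nxt := step(procs)) is not None:
        procs = nxt
        trace.append(show(procs))
    return trace


packet = [
    ("amb", "a", [("amb", "p", [("cap", "out", "a", [("cap", "in", "b", [("send", "c")])])])]),
    ("amb", "b", [("cap", "open", "p", [("recv", "x", [("amb", "x", [])])])]),
]
trace = run(packet)
```

Running `run(packet)` takes four steps (out, in, open, communication) and ends in the state rendered as `a[] | b[c[]]`, matching the final state given above.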

The 0 process represents inactivity; the notation a[] for an empty ambient 
named a, used above, is actually short for a[0]. There are also replication and 
restriction constructs. A replication !P behaves the same as an unlimited number 
of parallel copies of P. A restriction (νn)P creates a new name n with scope P.



2 The Typed Ambient Calculus with Groups 



We start with the typed ambient calculus of [CG99] and we add a new process
construct, (νG)P, to create a new group G with scope P. Correspondingly we
add a new type construct, G[T], for the type of names of group G that name
ambients that contain T exchanges.

The construct G[T] is actually a refinement of the construct Amb[T] of [CG99],
where Amb can now be seen as the group of all names. It is conceivable to introduce
a subtype ordering on groups, with Amb as the maximal element. However,
subtyping may help capabilities escape, particularly in the presence of a maximal 
element; we do not consider these extensions in this paper. 

We can now write, for example, the following typed process: 



(νCh)(νMsg)(νc:Ch[Msg[Shh]])(νm:Msg[Shh]) c[⟨m⟩ | (x:Msg[Shh]).x[]]



This creates two groups Ch and Msg and two names c and m belonging 
to those groups. The types ensure that only messages, that is, names of type 
Msg[Shh], can be exchanged inside an ambient named c, as happens in the rest
of the process. (The type Shh prohibits exchanges; names of type Msg[Shh] are 
in group Msg, and name ambients in which exchanges are prohibited.) 

The types of the ambient calculus with groups are the same as in [CG99],
except that G[T] replaces Amb[T]. We have types W for messages. Messages can
be either names of type G[T], or capabilities of type Cap[T]. We also have types
for processes, T, that classify processes according to the type of message tuples 
they exchange (if any). 



Types:

  W ::=                     message type
      G[T]                    ambient name in group G with T exchanges
      Cap[T]                  capability unleashing T exchanges
  S, T ::=                  exchange type
      Shh                     no exchange
      W_1 × ··· × W_k         tuple exchange (1 is the null product)




Expressions (messages) and processes are also the same as in [CG99], except
that we add processes (νG)P and include the objective moves of [CGG99].

The movement primitives of the untyped calculus, illustrated by the process
p[out a.in b.⟨c⟩] from Section 1.1, are called subjective moves; the capabilities
out a and in b move the ambient p from the inside. In the typed calculus, we also
take objective moves as primitive. In an objective move go N.M[P], the capability
N moves an ambient M[P] from the outside by following the path encoded by N,
and once there starts the ambient M[P]. In the untyped calculus, we can define
an objective move go N.M[P] to be short for the process (νk)k[N.M[out k.P]]
where k is not free in P. As we found in our previous work, a primitive
typing rule for objective moves allows more refined typings than are possible
with only subjective moves.



Expressions and processes:

  M, N ::=      expression        P, Q, R ::=                      process
      n           name                (νG)P                          group creation
      in M        can enter M         (νn:W)P                        restriction
      out M       can exit M          0                              inactivity
      open M      can open M          P | Q                          composition
      ε           null                !P                             replication
      M.M'        path                M[P]                           ambient
                                      M.P                            action
                                      (x_1:W_1, ..., x_k:W_k).P      input
                                      ⟨M_1, ..., M_k⟩                output
                                      go N.M[P]                      objective move




This grammar allows the formation of certain nonsensical processes, where
a capability is used in place of a name, as in (in n)[0], or vice versa, as in
(νn:W)n.0. Such garbled processes are not typable in any of our type systems.

In the processes (νG)P and (νn:W)P, the group G and the name n, respectively,
are bound, with scope P. In the process (x_1:W_1, ..., x_k:W_k).P, the names
x_1, ..., x_k are bound, with scope P. We identify processes up to consistent renaming
of bound names and bound groups. We write fn(P) for the set of names
free in process P, and we write fg(P), fg(W), and fg(T) for the sets of groups
free in process P, message type W, and exchange type T, respectively.

The following tables describe the structural congruence rules and the reduction
rules. The bottom four rules of structural congruence describe the extrusion
behavior of the (νG) binders. Side conditions on these rules prevent violation
of lexical scoping. The notation P{x_1←M_1, ..., x_k←M_k} used below in the
reduction rule for I/O denotes the outcome of a capture-avoiding simultaneous
substitution, for each i ∈ 1..k, of the expression M_i for each free occurrence of
the corresponding name x_i in the process P.

Structural Congruence:

  P ≡ P
  Q ≡ P  ⇒  P ≡ Q
  P ≡ Q, Q ≡ R  ⇒  P ≡ R

  P ≡ Q  ⇒  (νn:W)P ≡ (νn:W)Q
  P ≡ Q  ⇒  (νG)P ≡ (νG)Q
  P ≡ Q  ⇒  P | R ≡ Q | R
  P ≡ Q  ⇒  !P ≡ !Q
  P ≡ Q  ⇒  M[P] ≡ M[Q]
  P ≡ Q  ⇒  M.P ≡ M.Q
  P ≡ Q  ⇒  go N.M[P] ≡ go N.M[Q]
  P ≡ Q  ⇒  (x_1:W_1, ..., x_k:W_k).P ≡ (x_1:W_1, ..., x_k:W_k).Q

  P | Q ≡ Q | P
  (P | Q) | R ≡ P | (Q | R)
  !P ≡ P | !P

  n_1 ≠ n_2  ⇒  (νn_1:W_1)(νn_2:W_2)P ≡ (νn_2:W_2)(νn_1:W_1)P
  n ∉ fn(P)  ⇒  (νn:W)(P | Q) ≡ P | (νn:W)Q
  n ≠ m  ⇒  (νn:W)m[P] ≡ m[(νn:W)P]

  P | 0 ≡ P
  (νn:W)0 ≡ 0
  (νG)0 ≡ 0
  !0 ≡ 0

  ε.P ≡ P
  (M.M').P ≡ M.M'.P
  go ε.M[P] ≡ M[P]

  (νG_1)(νG_2)P ≡ (νG_2)(νG_1)P
  G ∉ fg(W)  ⇒  (νG)(νn:W)P ≡ (νn:W)(νG)P
  G ∉ fg(P)  ⇒  (νG)(P | Q) ≡ P | (νG)Q
  (νG)m[P] ≡ m[(νG)P]


Reduction:

  n[in m.P | Q] | m[R]  →  m[n[P | Q] | R]
  m[n[out m.P | Q] | R]  →  n[P | Q] | m[R]
  open n.P | n[Q]  →  P | Q
  ⟨M_1, ..., M_k⟩ | (x_1:W_1, ..., x_k:W_k).P  →  P{x_1←M_1, ..., x_k←M_k}
  go (in m.N).n[P] | m[Q]  →  m[go N.n[P] | Q]
  m[go (out m.N).n[P] | Q]  →  go N.n[P] | m[Q]

  P → Q  ⇒  (νG)P → (νG)Q
  P → Q  ⇒  (νn:W)P → (νn:W)Q
  P → Q  ⇒  P | R → Q | R
  P → Q  ⇒  n[P] → n[Q]
  P' ≡ P, P → Q, Q ≡ Q'  ⇒  P' → Q'


Next, we introduce the five basic judgments and the typing rules. Apart from
minor adaptations, the main novelty with respect to [CG99] is the rule with
conclusion E ⊢ (νG)P : T. The assumptions of this rule are that E, G ⊢ P : T
and G ∉ fg(T). The latter assumption prevents G from going out of scope in
the conclusion. Typing environments, E, are given by the grammar E ::= ∅ |
E, G | E, n:W. For each E, we inductively define dom(E) by the equations
dom(∅) = ∅, dom(E, G) = dom(E) ∪ {G}, and dom(E, n:W) = dom(E) ∪ {n}.
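
The inductive definition of dom(E) transcribes directly into code. In the Python sketch below, the list encoding of environments (entries `("group", G)` and `("name", n, W)`, with W kept as an opaque string) and the helper `entries_distinct` are assumptions of the illustration. The helper checks only the side conditions n ∉ dom(E) and G ∉ dom(E) of the good-environment rules, not the premise that each message type is itself good.

```python
def dom(env):
    """dom(E) for an environment given as a list of entries, oldest first.

    Entries are ("group", G) or ("name", n, W); in both cases the declared
    identifier is at index 1.
    """
    if not env:
        return set()                      # dom(∅) = ∅
    *rest, last = env
    return dom(rest) | {last[1]}          # dom(E, G) and dom(E, n:W)


def entries_distinct(env):
    """Check that each entry is fresh with respect to the entries before it."""
    return all(entry[1] not in dom(env[:i]) for i, entry in enumerate(env))


env = [("group", "G"), ("name", "n", "G[Shh]")]
names = dom(env)  # {"G", "n"}
```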



Judgments:

  E ⊢ ⋄          good environment
  E ⊢ W          good message type W
  E ⊢ T          good exchange type T
  E ⊢ M : W      good expression M of message type W
  E ⊢ P : T      good process P with exchange type T

Typing Rules:

  ∅ ⊢ ⋄
  E ⊢ W,  n ∉ dom(E)  ⇒  E, n:W ⊢ ⋄
  E ⊢ ⋄,  G ∉ dom(E)  ⇒  E, G ⊢ ⋄

  G ∈ dom(E),  E ⊢ T  ⇒  E ⊢ G[T]
  E ⊢ T  ⇒  E ⊢ Cap[T]
  E ⊢ ⋄  ⇒  E ⊢ Shh
  E ⊢ ⋄,  E ⊢ W_1, ..., E ⊢ W_k  ⇒  E ⊢ W_1 × ··· × W_k

  E', n:W, E'' ⊢ ⋄  ⇒  E', n:W, E'' ⊢ n : W
  E ⊢ Cap[T]  ⇒  E ⊢ ε : Cap[T]
  E ⊢ M : Cap[T],  E ⊢ M' : Cap[T]  ⇒  E ⊢ M.M' : Cap[T]
  E ⊢ n : G[S],  E ⊢ T  ⇒  E ⊢ in n : Cap[T]
  E ⊢ n : G[S],  E ⊢ T  ⇒  E ⊢ out n : Cap[T]
  E ⊢ n : G[T]  ⇒  E ⊢ open n : Cap[T]

  E ⊢ M : Cap[T],  E ⊢ P : T  ⇒  E ⊢ M.P : T
  E ⊢ M : G[S],  E ⊢ P : S,  E ⊢ T  ⇒  E ⊢ M[P] : T
  E, n:G[S] ⊢ P : T  ⇒  E ⊢ (νn:G[S])P : T
  E, G ⊢ P : T,  G ∉ fg(T)  ⇒  E ⊢ (νG)P : T
  E ⊢ T  ⇒  E ⊢ 0 : T
  E ⊢ P : T  ⇒  E ⊢ !P : T
  E ⊢ P : T,  E ⊢ Q : T  ⇒  E ⊢ P | Q : T
  E, n_1:W_1, ..., n_k:W_k ⊢ P : W_1 × ··· × W_k  ⇒  E ⊢ (n_1:W_1, ..., n_k:W_k).P : W_1 × ··· × W_k
  E ⊢ M_1 : W_1, ..., E ⊢ M_k : W_k  ⇒  E ⊢ ⟨M_1, ..., M_k⟩ : W_1 × ··· × W_k
  E ⊢ N : Cap[S'],  E ⊢ M : G[S],  E ⊢ P : S,  E ⊢ T  ⇒  E ⊢ go N.M[P] : T


We obtain a standard subject reduction result. A subtle point, though, is the
need to account for the appearance of new groups (G_1, ..., G_k, below) during
reduction. This is because reduction is defined up to structural congruence, and
structural congruence does not preserve the set of free groups of a process. The
culprit is the rule (νn:W)0 ≡ 0, in which groups free in W are not free in 0.

Theorem 1. If E ⊢ P : T and either P ≡ Q or P → Q then there are G_1, ...,
G_k such that G_1, ..., G_k, E ⊢ Q : T.

3 Opening Control 

In this section, to control usage of the open capability, we add attributes to 
the ambient types, G[T], and the capability types, Cap[T], of the previous type
system. (In the next section, to control usage of the in and out capabilities, we 
add further attributes.) 

To control the opening of ambients, we formalize the constraint that the
name of any ambient opened by a process is in one of the groups G_1, ..., G_k,
but in no others. To do so, we add an attribute °{G_1, ..., G_k} to ambient types,
which now take the form G[°{G_1, ..., G_k}, T]. A name of this type is in group G,
and names ambients within which processes may exchange messages of type T
and may only open ambients in the groups G_1, ..., G_k. We need to add the same
attribute to capability types, which now take the form Cap[°{G_1, ..., G_k}, T].
Exercising a capability of this type may unleash exchanges of type T and openings
of ambients in groups G_1, ..., G_k. The typing judgment for processes acquires
the form E ⊢ P : °{G_1, ..., G_k}, T. The pair °{G_1, ..., G_k}, T constrains
both the opening effects (what ambients the process opens) and the exchange
effects (what messages the process exchanges). We call such a pair an effect, and
introduce the metavariable F to range over effects. It is also convenient to introduce
metavariables 𝐆, 𝐇 (set in bold) to range over finite sets of name groups. The following
table summarizes these metavariable conventions and our enhanced syntax for
types:



Group Sets and Types:

  𝐆, 𝐇 ::= {G_1, ..., G_k}    finite set of name groups
  W ::=                        message type
      G[F]                       ambient name in group G (contains processes with F effects)
      Cap[F]                     capability (unleashes F effects)
  F ::=                        effect
      °𝐇, T                      may open 𝐇, may exchange T
  S, T ::=                     exchange type
      Shh                        no exchange
      W_1 × ··· × W_k            tuple exchange



The following tables define the type system in detail. There are five basic
judgments as before. They have the same format except that the judgment E ⊢
F, meaning that the effect F is good given environment E, replaces the previous
judgment E ⊢ T. We omit the three rules for deriving good environments; they
are exactly as in the previous section. There are two main differences between
the other rules below and the rules of the previous section. First, effects, F,
replace exchange types, T, throughout. Second, in the rule ascribing a type to
open n, the condition G ∈ 𝐇 constrains the opening effect 𝐇 of the capability
open n to include the group G, the group of the name n.


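Judgments:

$$
\begin{array}{ll}
E \vdash \diamond & \text{good environment} \\
E \vdash W & \text{good message type } W \\
E \vdash F & \text{good effect } F \\
E \vdash M : W & \text{good expression } M \text{ of message type } W \\
E \vdash P : F & \text{good process } P \text{ with } F \text{ effects}
\end{array}
$$

Typing Rules:

$$
\frac{G \in \mathit{dom}(E) \quad E \vdash F}{E \vdash G[F]}
\qquad
\frac{E \vdash F}{E \vdash \mathit{Cap}[F]}
\qquad
\frac{\mathbf{H} \subseteq \mathit{dom}(E) \quad E \vdash \diamond}{E \vdash {}^{\circ}\mathbf{H}, \mathit{Shh}}
$$

$$
\frac{\mathbf{H} \subseteq \mathit{dom}(E) \quad E \vdash W_1 \ \cdots\ E \vdash W_k}{E \vdash {}^{\circ}\mathbf{H}, W_1 \times \cdots \times W_k}
\qquad
\frac{E', n{:}W, E'' \vdash \diamond}{E', n{:}W, E'' \vdash n : W}
$$

$$
\frac{E \vdash \mathit{Cap}[F]}{E \vdash \epsilon : \mathit{Cap}[F]}
\qquad
\frac{E \vdash M : \mathit{Cap}[F] \quad E \vdash M' : \mathit{Cap}[F]}{E \vdash M.M' : \mathit{Cap}[F]}
$$

$$
\frac{E \vdash n : G[F] \quad E \vdash {}^{\circ}\mathbf{H}, T}{E \vdash \mathit{in}\ n : \mathit{Cap}[{}^{\circ}\mathbf{H}, T]}
\qquad
\frac{E \vdash n : G[F] \quad E \vdash {}^{\circ}\mathbf{H}, T}{E \vdash \mathit{out}\ n : \mathit{Cap}[{}^{\circ}\mathbf{H}, T]}
$$

$$
\frac{E \vdash n : G[{}^{\circ}\mathbf{H}, T] \quad G \in \mathbf{H}}{E \vdash \mathit{open}\ n : \mathit{Cap}[{}^{\circ}\mathbf{H}, T]}
\qquad
\frac{E \vdash M : \mathit{Cap}[F] \quad E \vdash P : F}{E \vdash M.P : F}
$$

$$
\frac{E \vdash M : G[F] \quad E \vdash P : F \quad E \vdash F'}{E \vdash M[P] : F'}
\qquad
\frac{E, n{:}G[F] \vdash P : F'}{E \vdash (\nu n{:}G[F])P : F'}
$$

$$
\frac{E, G \vdash P : F \quad G \notin \mathit{fg}(F)}{E \vdash (\nu G)P : F}
\qquad
\frac{E \vdash F}{E \vdash 0 : F}
\qquad
\frac{E \vdash P : F}{E \vdash\ !P : F}
$$

$$
\frac{E \vdash P : F \quad E \vdash Q : F}{E \vdash P \mid Q : F}
\qquad
\frac{E, n_1{:}W_1, \ldots, n_k{:}W_k \vdash P : {}^{\circ}\mathbf{H}, W_1 \times \cdots \times W_k}{E \vdash (n_1{:}W_1, \ldots, n_k{:}W_k).P : {}^{\circ}\mathbf{H}, W_1 \times \cdots \times W_k}
$$

$$
\frac{E \vdash M_1 : W_1 \ \cdots\ E \vdash M_k : W_k \quad \mathbf{H} \subseteq \mathit{dom}(E)}{E \vdash \langle M_1, \ldots, M_k \rangle : {}^{\circ}\mathbf{H}, W_1 \times \cdots \times W_k}
$$

$$
\frac{E \vdash N : \mathit{Cap}[{}^{\circ}\mathbf{H}, T] \quad E \vdash M : G[F] \quad E \vdash P : F \quad E \vdash F'}{E \vdash \mathit{go}\ N.M[P] : F'}
$$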

Theorem 2. If $E \vdash P : F$ and either $P \equiv Q$ or $P \rightarrow Q$ then
there are $G_1, \ldots, G_k$ such that $G_1, \ldots, G_k, E \vdash Q : F$.

Here is a simple example of a typing derivable in this system:

$$G,\ n{:}G[{}^{\circ}\{G\}, \mathit{Shh}] \vdash n[0] \mid \mathit{open}\ n.0 : {}^{\circ}\{G\}, \mathit{Shh}$$

This asserts that the whole process n[0] | open n.0 is well-typed and opens only
ambients in the group G.

On the other hand, one might expect the following variant to be derivable,
but it is not:

$$G,\ n{:}G[{}^{\circ}\emptyset, \mathit{Shh}] \nvdash n[0] \mid \mathit{open}\ n.0 : {}^{\circ}\{G\}, \mathit{Shh}$$

This is because the typing rule for open n requires the effect unleashed by the
open n capability to be the same as the effect contained within the ambient n.
But the opening effect ${}^{\circ}\emptyset$ specified by the type
$G[{}^{\circ}\emptyset, \mathit{Shh}]$ of n cannot be the same as the effect
unleashed by open n, because the rule also requires the latter to at least
include the group G of n.
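
The side condition on open n can be made concrete with a small sketch. The
Python below is only an illustration under simplifying assumptions: the names
AmbientType and type_open are hypothetical, exchange types are kept as plain
strings, and only the rule for open n is modelled.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Tuple

# An effect F = (H, T): the set of groups that may be opened, and an exchange type.
Effect = Tuple[FrozenSet[str], str]


@dataclass(frozen=True)
class AmbientType:
    group: str      # the group G of the name
    effect: Effect  # the effect F of the processes inside the ambient


def type_open(env: Dict[str, AmbientType], n: str) -> Optional[Effect]:
    """Return F such that `open n : Cap[F]`, or None if the rule does not apply."""
    ty = env[n]
    opens, _exchange = ty.effect
    # Side condition of the rule: the group of n must belong to the opening effect.
    if ty.group not in opens:
        return None
    # The capability unleashes exactly the effect contained in the ambient.
    return ty.effect


# n : G[{G}, Shh] is openable; n : G[{}, Shh] is not.
good_env = {"n": AmbientType("G", (frozenset({"G"}), "Shh"))}
bad_env = {"n": AmbientType("G", (frozenset(), "Shh"))}
good_result = type_open(good_env, "n")
bad_result = type_open(bad_env, "n")
```

With the second environment no effect can be assigned to open n at all, which
mirrors why the variant typing above is not derivable.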

We have not found this feature to be problematic, and indeed it has a positive
side-effect: the type $G[{}^{\circ}\mathbf{G}, T]$ of an ambient name n not only
tells which opening effects may happen inside the ambient, but also tells whether
n may be opened from outside: it is openable only if $G \in \mathbf{G}$, since
this is the only case when open n.0 | n[P] can be well typed. Hence, the presence
of $G$ in the set $\mathbf{G}$ may either mean that n is meant to be an ambient
within which other ambients in group $G$ may be opened, or that it is meant to be
an openable ambient.



4 Crossing Control 



This section presents the third and final type system of the paper, obtained by
enriching the type system of the previous section with attributes to control
mobility.

Movement operators enable an ambient n to cross the boundary of another 
ambient m either by entering it via an in m capability or by exiting it via 
an out m capability. In the type system of this section, the type of n lists 
those groups that may be crossed; the ambient n may only cross the boundary 
of another ambient m if the group of m is included in this list. In our typed 
calculus, there are two kinds of movement, subjective moves and objective moves. 
Therefore, we separately list those groups that may be crossed by objective moves 
and those groups that may be crossed by subjective moves. 

We add new attributes to the syntax of ambient types, effects, and capability
types. An ambient type acquires the form
$G^{\curvearrowright}\mathbf{G}'[{}^{\curvearrowright}\mathbf{G}, {}^{\circ}\mathbf{H}, T]$.
An ambient of this type is in group $G$, may cross ambients in groups
$\mathbf{G}'$ by objective moves, may cross ambients in groups $\mathbf{G}$ by
subjective moves, may open ambients in groups $\mathbf{H}$, and may contain
exchanges of type $T$. An effect, $F$, of a process is now of the form
${}^{\curvearrowright}\mathbf{G}, {}^{\circ}\mathbf{H}, T$. It asserts that the
process may exercise in and out capabilities to accomplish subjective moves
across ambients in groups $\mathbf{G}$, that the process may open ambients in
groups $\mathbf{H}$, and that the process may exchange messages of type $T$.
Finally, a capability type retains the form $\mathit{Cap}[F]$, but with the new
interpretation of $F$. Exercising a capability of this type may unleash $F$
effects.



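Types:

$$
\begin{array}{lll}
W ::= & & \text{message type} \\
 & G^{\curvearrowright}\mathbf{G}[F] & \text{ambient name in group } G \text{, crosses } \mathbf{G} \text{ objectively, contains processes with } F \text{ effects} \\
 & \mathit{Cap}[F] & \text{capability (unleashes } F \text{ effects)} \\
F ::= & & \text{effect} \\
 & {}^{\curvearrowright}\mathbf{G}, {}^{\circ}\mathbf{H}, T & \text{crosses } \mathbf{G} \text{, opens } \mathbf{H} \text{, exchanges } T \\
S, T ::= & & \text{exchange type} \\
 & \mathit{Shh} & \text{no exchange} \\
 & W_1 \times \cdots \times W_k & \text{tuple exchange}
\end{array}
$$

The format of the five judgments making up the system is the same as in the
previous section. We omit the three rules defining good environments; they are
unchanged from the earlier systems. There are two main changes to the previous
system to control mobility. First, the rules for typing in n and out n change to
assign a type $\mathit{Cap}[{}^{\curvearrowright}\mathbf{G}, {}^{\circ}\mathbf{H}, T]$
to the capabilities in n and out n only if $G \in \mathbf{G}$, where $G$ is the
group of n. Second, the rule for objective moves changes to allow an objective
move of an ambient of type $G^{\curvearrowright}\mathbf{G}'[F']$ by a capability
of type $\mathit{Cap}[{}^{\curvearrowright}\mathbf{G}, {}^{\circ}\mathbf{H}, T]$
only if $\mathbf{G} = \mathbf{G}'$.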

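Typing Rules:

$$
\frac{G \in \mathit{dom}(E) \quad \mathbf{G} \subseteq \mathit{dom}(E) \quad E \vdash F}{E \vdash G^{\curvearrowright}\mathbf{G}[F]}
\qquad
\frac{E \vdash F}{E \vdash \mathit{Cap}[F]}
$$

$$
\frac{\mathbf{G} \subseteq \mathit{dom}(E) \quad \mathbf{H} \subseteq \mathit{dom}(E) \quad E \vdash \diamond}{E \vdash {}^{\curvearrowright}\mathbf{G}, {}^{\circ}\mathbf{H}, \mathit{Shh}}
$$

$$
\frac{\mathbf{G} \subseteq \mathit{dom}(E) \quad \mathbf{H} \subseteq \mathit{dom}(E) \quad E \vdash W_1 \ \cdots\ E \vdash W_k}{E \vdash {}^{\curvearrowright}\mathbf{G}, {}^{\circ}\mathbf{H}, W_1 \times \cdots \times W_k}
$$

$$
\frac{E', n{:}W, E'' \vdash \diamond}{E', n{:}W, E'' \vdash n : W}
\qquad
\frac{E \vdash \mathit{Cap}[F]}{E \vdash \epsilon : \mathit{Cap}[F]}
\qquad
\frac{E \vdash M : \mathit{Cap}[F] \quad E \vdash M' : \mathit{Cap}[F]}{E \vdash M.M' : \mathit{Cap}[F]}
$$

$$
\frac{E \vdash n : G^{\curvearrowright}\mathbf{G}'[F] \quad E \vdash {}^{\curvearrowright}\mathbf{G}, {}^{\circ}\mathbf{H}, T \quad G \in \mathbf{G}}{E \vdash \mathit{in}\ n : \mathit{Cap}[{}^{\curvearrowright}\mathbf{G}, {}^{\circ}\mathbf{H}, T]}
$$

$$
\frac{E \vdash n : G^{\curvearrowright}\mathbf{G}'[F] \quad E \vdash {}^{\curvearrowright}\mathbf{G}, {}^{\circ}\mathbf{H}, T \quad G \in \mathbf{G}}{E \vdash \mathit{out}\ n : \mathit{Cap}[{}^{\curvearrowright}\mathbf{G}, {}^{\circ}\mathbf{H}, T]}
$$

$$
\frac{E \vdash n : G^{\curvearrowright}\mathbf{G}'[{}^{\curvearrowright}\mathbf{G}, {}^{\circ}\mathbf{H}, T] \quad G \in \mathbf{H}}{E \vdash \mathit{open}\ n : \mathit{Cap}[{}^{\curvearrowright}\mathbf{G}, {}^{\circ}\mathbf{H}, T]}
$$

$$
\frac{E \vdash M : \mathit{Cap}[F] \quad E \vdash P : F}{E \vdash M.P : F}
\qquad
\frac{E \vdash M : G^{\curvearrowright}\mathbf{G}[F] \quad E \vdash P : F \quad E \vdash F'}{E \vdash M[P] : F'}
$$

$$
\frac{E, n{:}G^{\curvearrowright}\mathbf{G}[F] \vdash P : F'}{E \vdash (\nu n{:}G^{\curvearrowright}\mathbf{G}[F])P : F'}
$$

$$
\frac{E, G \vdash P : F \quad G \notin \mathit{fg}(F)}{E \vdash (\nu G)P : F}
\qquad
\frac{E \vdash F}{E \vdash 0 : F}
$$

$$
\frac{E \vdash P : F}{E \vdash\ !P : F}
\qquad
\frac{E \vdash P : F \quad E \vdash Q : F}{E \vdash P \mid Q : F}
$$

$$
\frac{E, n_1{:}W_1, \ldots, n_k{:}W_k \vdash P : {}^{\curvearrowright}\mathbf{G}, {}^{\circ}\mathbf{H}, W_1 \times \cdots \times W_k}{E \vdash (n_1{:}W_1, \ldots, n_k{:}W_k).P : {}^{\curvearrowright}\mathbf{G}, {}^{\circ}\mathbf{H}, W_1 \times \cdots \times W_k}
$$

$$
\frac{E \vdash M_1 : W_1 \ \cdots\ E \vdash M_k : W_k \quad \mathbf{G} \subseteq \mathit{dom}(E) \quad \mathbf{H} \subseteq \mathit{dom}(E)}{E \vdash \langle M_1, \ldots, M_k \rangle : {}^{\curvearrowright}\mathbf{G}, {}^{\circ}\mathbf{H}, W_1 \times \cdots \times W_k}
$$

$$
\frac{E \vdash N : \mathit{Cap}[{}^{\curvearrowright}\mathbf{G}', {}^{\circ}\mathbf{H}, T] \quad E \vdash M : G^{\curvearrowright}\mathbf{G}'[F] \quad E \vdash P : F \quad E \vdash F'}{E \vdash \mathit{go}\ N.M[P] : F'}
$$

Theorem 3. If $E \vdash P : F$ and either $P \equiv Q$ or $P \rightarrow Q$ then
there are $G_1, \ldots, G_k$ such that $G_1, \ldots, G_k, E \vdash Q : F$.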

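Recall the untyped example given earlier. Consider two groups G and H.
Let $W = G^{\curvearrowright}\emptyset[{}^{\curvearrowright}\emptyset, {}^{\circ}\emptyset, \mathit{Shh}]$
and set P to be the example process:

$$P = a[\,p[\mathit{out}\ a.\mathit{in}\ b.\langle c \rangle]\,] \mid b[\,\mathit{open}\ p.(x{:}W).x[\,]\,]$$

Let $E = G, H, a{:}W, b{:}G^{\curvearrowright}\emptyset[{}^{\curvearrowright}\{G\}, {}^{\circ}\{H\}, W], c{:}W, p{:}H^{\curvearrowright}\emptyset[{}^{\curvearrowright}\{G\}, {}^{\circ}\{H\}, W]$.
Then we can derive the typings:

$$
\begin{array}{l}
E \vdash \mathit{out}\ a.\mathit{in}\ b.\langle c \rangle : {}^{\curvearrowright}\{G\}, {}^{\circ}\{H\}, W \\
E \vdash \mathit{open}\ p.(x{:}W).x[\,] : {}^{\curvearrowright}\{G\}, {}^{\circ}\{H\}, W \\
E \vdash P : {}^{\curvearrowright}\emptyset, {}^{\circ}\emptyset, \mathit{Shh}
\end{array}
$$

From the typings $a, c : G^{\curvearrowright}\emptyset[{}^{\curvearrowright}\emptyset, {}^{\circ}\emptyset, \mathit{Shh}]$,
we can tell that ambients a and c are immobile ambients in which nothing is
exchanged and that cannot be opened. From the typings
$p{:}H^{\curvearrowright}\emptyset[{}^{\curvearrowright}\{G\}, {}^{\circ}\{H\}, W]$ and
$b{:}G^{\curvearrowright}\emptyset[{}^{\curvearrowright}\{G\}, {}^{\circ}\{H\}, W]$, we
can tell that the ambients b and p cross only G ambients, open only H ambients,
and contain W exchanges; the typing of p also tells us it can be opened. This
is good, but is not fully satisfactory, since, if b were meant to be immobile, we
would like to express this immobility invariant in its type. However, since b
opens a subjectively mobile ambient, b must be typed as if it were subjectively
mobile itself (by the rule for open).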

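As observed previously, this problem can be solved by replacing the
subjective moves by objective moves. Let
$W = G^{\curvearrowright}\emptyset[{}^{\curvearrowright}\emptyset, {}^{\circ}\emptyset, \mathit{Shh}]$,
again, and set Q to be the example process with objective instead of subjective
moves:

$$Q = a[\,\mathit{go}\ (\mathit{out}\ a.\mathit{in}\ b).p[\langle c \rangle]\,] \mid b[\,\mathit{open}\ p.(x{:}W).x[\,]\,]$$

Let $E = G, H, a{:}W, b{:}G^{\curvearrowright}\emptyset[{}^{\curvearrowright}\emptyset, {}^{\circ}\{H\}, W], c{:}W, p{:}H^{\curvearrowright}\{G\}[{}^{\curvearrowright}\emptyset, {}^{\circ}\{H\}, W]$,
and we can derive:

$$
\begin{array}{l}
E \vdash \mathit{out}\ a.\mathit{in}\ b : \mathit{Cap}[{}^{\curvearrowright}\{G\}, {}^{\circ}\emptyset, \mathit{Shh}] \\
E \vdash \mathit{go}\ (\mathit{out}\ a.\mathit{in}\ b).p[\langle c \rangle] : {}^{\curvearrowright}\emptyset, {}^{\circ}\emptyset, \mathit{Shh} \\
E \vdash \mathit{open}\ p.(x{:}W).x[\,] : {}^{\curvearrowright}\emptyset, {}^{\circ}\{H\}, W \\
E \vdash Q : {}^{\curvearrowright}\emptyset, {}^{\circ}\emptyset, \mathit{Shh}
\end{array}
$$

The typings of a and c are unchanged, but the new typings of p and b are more
informative. We can tell from the typing
$p{:}H^{\curvearrowright}\{G\}[{}^{\curvearrowright}\emptyset, {}^{\circ}\{H\}, W]$
that movement of p is now due to objective rather than subjective moves. We can
now tell from the typing
$b{:}G^{\curvearrowright}\emptyset[{}^{\curvearrowright}\emptyset, {}^{\circ}\{H\}, W]$
that the ambient b is immobile.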
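
The two side conditions that drive this example can be checked mechanically.
The following Python sketch is an illustration only: Effect, AmbientType,
in_out_ok and go_ok are hypothetical names, exchange types are abstract strings,
and nothing beyond the two side conditions (the group of n belongs to the
crossing set for in n and out n, and the crossing sets agree for go) is
modelled.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Effect:
    crosses: FrozenSet[str]  # groups crossed by subjective moves
    opens: FrozenSet[str]    # groups that may be opened
    exchange: str            # exchange type, kept abstract


@dataclass(frozen=True)
class AmbientType:
    group: str
    obj_crosses: FrozenSet[str]  # groups crossed by objective moves
    effect: Effect


def in_out_ok(target: AmbientType, cap: Effect) -> bool:
    """`in n` / `out n` may get type Cap[cap] only if the group of n is crossed."""
    return target.group in cap.crosses


def go_ok(cap: Effect, moved: AmbientType) -> bool:
    """`go N.M[P]` needs N's crossing set to equal M's objective crossing set."""
    return cap.crosses == moved.obj_crosses


EMPTY: FrozenSet[str] = frozenset()
# Types from the second example: a and c have type W; p moves objectively across G.
w_ty = AmbientType("G", EMPTY, Effect(EMPTY, EMPTY, "Shh"))
p_ty = AmbientType("H", frozenset({"G"}), Effect(EMPTY, frozenset({"H"}), "W"))
b_ty = AmbientType("G", EMPTY, Effect(EMPTY, frozenset({"H"}), "W"))
# Effect carried by the capability `out a.in b`.
cap = Effect(frozenset({"G"}), EMPTY, "Shh")

checks = (in_out_ok(w_ty, cap), in_out_ok(b_ty, cap), go_ok(cap, p_ty), go_ok(cap, b_ty))
```

The last check fails by design: b has an empty objective crossing set, so the
same capability could not be used to move b objectively.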

This example suggests that in some situations objective moves lead to more
informative typings than subjective moves. Still, subjective moves are essential
for moving ambients containing running processes. We need such ambients to
model mobile agents, for example.



5 Upper Bounds on Capabilities Imposed by Effects



Like most other type systems for concurrent calculi, ours does not guarantee
liveness, for example, the absence of deadlocks. Still, we may regard the effect
assigned to a process as a safety property: an upper bound on the capabilities
that may be exercised by the process, and hence on its behavior. We formalize
this idea in the setting of our third type system, and explain some consequences.
A similar analysis can be applied to the simpler type system for opening control.
We say that a process P exercises a capability M, one of in n or out n or
open n, just if $P \downarrow M$ may be derived by the following rules:

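Exercising a capability: $P \downarrow M$ where $M \in \{\mathit{in}\ n, \mathit{out}\ n, \mathit{open}\ n\}$

$$
\frac{P = M.Q}{P \downarrow M}
\qquad
\frac{P \downarrow M}{P \mid Q \downarrow M}
\qquad
\frac{Q \downarrow M}{P \mid Q \downarrow M}
\qquad
\frac{P \downarrow M \quad n \notin \mathit{fn}(M)}{(\nu n{:}W)P \downarrow M}
\qquad
\frac{P \downarrow M}{(\nu G)P \downarrow M}
$$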
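
As an illustration, the rules above can be read as a recursive predicate over
process syntax. The Python sketch below makes several assumptions that are not
in the text: the class names are hypothetical, only the constructs mentioned in
the rules are represented, and the first rule is read as syntactic equality
with a prefixed process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cap:
    kind: str  # "in", "out", or "open"
    name: str  # the ambient name n, so fn(M) = {n}


@dataclass(frozen=True)
class Nil:  # the inactive process 0
    pass


@dataclass(frozen=True)
class Prefix:  # M.Q
    cap: Cap
    body: object


@dataclass(frozen=True)
class Par:  # P | Q
    left: object
    right: object


@dataclass(frozen=True)
class NewName:  # (nu n:W)P, with the type W omitted
    name: str
    body: object


@dataclass(frozen=True)
class NewGroup:  # (nu G)P
    group: str
    body: object


def exercises(p: object, m: Cap) -> bool:
    """Decide whether P exercises M, following the five rules one by one."""
    if isinstance(p, Prefix):
        return p.cap == m
    if isinstance(p, Par):
        return exercises(p.left, m) or exercises(p.right, m)
    if isinstance(p, NewName):
        # Side condition: the restricted name must not occur free in M.
        return p.name != m.name and exercises(p.body, m)
    if isinstance(p, NewGroup):
        return exercises(p.body, m)
    return False


in_n = Cap("in", "n")
visible = exercises(Par(Nil(), NewName("m", Prefix(in_n, Nil()))), in_n)
hidden = exercises(NewName("n", Prefix(in_n, Nil())), in_n)
```

In the second case the restriction binds n itself, so the capability in n is not
exercised by the restricted process.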



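We begin by defining a fragment of a labelled transition system for the ambient
calculus. We say that a process P exercises a capability M, one of in n or out n
or open n, to leave residue P' just if the M-labelled transition
$P \xrightarrow{M} P'$ may be derived by the following rules:

Labelled Transitions: $P \xrightarrow{M} P'$ where $M \in \{\mathit{in}\ n, \mathit{out}\ n, \mathit{open}\ n\}$

$$
\frac{P = M.Q}{P \xrightarrow{M} Q}
\qquad
\frac{P \xrightarrow{M} P' \quad m \notin \mathit{fn}(M)}{(\nu m{:}W)P \xrightarrow{M} (\nu m{:}W)P'}
\qquad
\frac{P \xrightarrow{M} P'}{(\nu G)P \xrightarrow{M} (\nu G)P'}
$$

$$
\frac{P \xrightarrow{M} P'}{P \mid Q \xrightarrow{M} P' \mid Q}
\qquad
\frac{Q \xrightarrow{M} Q'}{P \mid Q \xrightarrow{M} P \mid Q'}
$$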



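The following asserts that the group of the name contained in any capability
exercised by a well-typed process is bounded by the effect assigned to the process.
It is a corollary of Theorem 3.

Theorem 4 (Effect Safety). Suppose that
$E \vdash P : {}^{\curvearrowright}\mathbf{G}, {}^{\circ}\mathbf{H}, T$.

(1) If $P \downarrow \mathit{in}\ n$ then $E \vdash n : G^{\curvearrowright}\mathbf{G}'[F]$ for some type $G^{\curvearrowright}\mathbf{G}'[F]$ with $G \in \mathbf{G}$.

(2) If $P \downarrow \mathit{out}\ n$ then $E \vdash n : G^{\curvearrowright}\mathbf{G}'[F]$ for some type $G^{\curvearrowright}\mathbf{G}'[F]$ with $G \in \mathbf{G}$.

(3) If $P \downarrow \mathit{open}\ n$ then $E \vdash n : G^{\curvearrowright}\mathbf{G}'[F]$ for some type $G^{\curvearrowright}\mathbf{G}'[F]$ with $G \in \mathbf{H}$.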

To explain the operational significance of this theorem, consider a name
$m : G'^{\curvearrowright}\mathbf{G}'[{}^{\curvearrowright}\mathbf{G}, {}^{\circ}\mathbf{H}, T]$
and a well-typed ambient m[P]. Suppose that m[P] is a subprocess of some
well-typed process Q. We can show, by adapting standard techniques, two
connections between the M-labelled transitions of the process P and the
reductions immediately derivable from the whole process Q.

First, within Q, the ambient m[P] can cross the boundary of another ambient
named n of some group G only if either $P \xrightarrow{\mathit{in}\ n} P'$ or
$P \xrightarrow{\mathit{out}\ n} P'$ for some P'. The typing rule for ambients
implies that P must have effect
${}^{\curvearrowright}\mathbf{G}, {}^{\circ}\mathbf{H}, T$. Part (1) or (2) of the
theorem implies that the set $\mathbf{G}$ contains G. Second, suppose that P
includes a top-level ambient named n. The boundary of n can be dissolved only if
$P \xrightarrow{\mathit{open}\ n} P'$ for some P'. Since P has effect
${}^{\curvearrowright}\mathbf{G}, {}^{\circ}\mathbf{H}, T$, part (3) of the theorem
implies that the set $\mathbf{H}$ contains G. So the set $\mathbf{G}$ includes the
groups of all ambients that can be crossed by m[P], and the set $\mathbf{H}$
includes the groups of all ambients that can be opened within m[P].

A corollary of Theorem 3 is that these bounds on ambient behavior apply not
just to ambients contained within Q, but to ambients contained in any process
reachable by a series of reductions from Q.



6 Conclusions 

Our contribution is a new type system for tracking the behavior of mobile com- 
putations. We introduced the idea of a name group. A name group represents a 
collection of ambient names; ambient names belong to name groups in the same 
sense that values belong to types. We studied the properties of a new process 
operator $(\nu G)P$ that lexically scopes groups. Using groups, our type system can
impose behavioral constraints like “this ambient crosses only ambients in one set
of groups, and only dissolves ambients in another set of groups”. Our previous
type system for mobility [CGG99] cannot express such constraints.

In the extended version of this paper [CGG00], we revisit an encoding of a
distributed programming language that we first reported in the technical report
version of our earlier work [CGG99]. In the encoding, ambients model both
network nodes and the threads that may migrate between the nodes. The encoding
can be typed in all three of the systems presented in this paper. The encoding 
illustrates how ambient groups can be used to partition the set of ambient names 
according to their intended usage, and how opening and crossing control allows 
the programmer to state some of those programming invariants which are the 
most interesting when programming mobile computation. For example, the ty- 
ping allows threads to cross node boundaries, but not mistakenly the other way 
round, and guarantees that neither threads nor nodes may be opened. We use 
$(\nu G)$ to make fresh groups for certain synchronization ambients in the encoding.
The benefit of $(\nu G)$ is that we can be statically assured that these
synchronization ambients are known only to the processes we intend to
synchronize, and propagate no further.

Our groups are similar to the sorts used as static classifications of names in
the π-calculus. Our basic type system is comparable to Milner’s sort system for
π, except that a new sort operator does not seem to have been considered in the
π-calculus literature. Another difference is that sorts in the π-calculus are
mutually recursive; we would have to add a recursion operator to achieve a
similar effect. Our systems for opening and crossing control depend on groups
to constrain the opening and crossing behavior of processes. We are not aware
of any uses of Milner’s sorts to control process behavior beyond controlling the
sorts of communicated names.

Apart from Milner’s sorts, other static classifications of names occur in
derivatives of the π-calculus. We mention two examples. In the type system of
Abadi for the spi calculus, names are classified by three static security
levels (Public, Secret, and Any) to prevent insecure information flows. In
the flow analysis of Bodei, Degano, Nielson, and Nielson [BDNN98] for the
π-calculus, names are classified by static channels and binders, again with the
purpose of establishing security properties. (A similar flow analysis now exists
for the ambient calculus.) Although there is a similarity between these
notions and groups, and indeed to sorts, nothing akin to our $(\nu G)$ operator
appears to have been studied.

There is a connection between name groups and the region variables in the
work of Tofte and Talpin [TT97] on region-based implementation of the
λ-calculus. The store is split into a set of stack-allocated regions, and the type
of each stored value is labelled with the region in which the value is stored. The
scoping construct letregion ρ in e allocates a fresh region, binds it to the region
variable ρ, evaluates e, and on completion, deallocates the region bound to ρ.
The constructs letregion ρ in e and $(\nu G)P$ are similar in that they confer static
scopes on the region variable ρ and the group G, respectively. One difference is
that in our operational semantics $(\nu G)P$ is simply a scoping construct; it
allocates no storage. Another is that scope extrusion laws do not seem to have been
explicitly investigated for letregion. Still, we can interpret letregion in terms of
$(\nu G)$, and intend to report this in a future paper.

Levi and Sangiorgi’s type system for a generalization of the ambient calculus
can guarantee immobility and single-threadedness. It would be interesting to
consider extensions of their type system with groups.



An Asynchronous, Distributed
Implementation of Mobile Ambients



Cedric Fournet (1), Jean-Jacques Levy (2), and Alan Schmitt (2)

(1) Microsoft Research
(2) INRIA Rocquencourt



Abstract. We present a first distributed implementation of Cardelli and
Gordon’s ambient calculus. We use Jocaml as an implementation language and
we present a formal translation of Ambients into the distributed join
calculus, the process calculus associated with Jocaml. We prove the
correctness of the translation.



1 Introduction 

We present a highly concurrent distributed implementation of Cardelli and
Gordon’s calculus of Mobile Ambients in Jocaml. The ambient calculus
is a simple and very esthetic model for distributed mobile computing. However,
until now, it did not have a distributed implementation. Such an implementation
may seem easy to build, especially with a language with distribution and strong
migration (Jocaml), but we encountered several difficulties and design choices.

Ambients are nested. Their dynamics is defined by three atomic steps: an
ambient may move into a sibling ambient (In), it may move out of its parent
ambient (Out), or it may open one of its child ambients (Open). Each atomic
migration step may involve several ambients, possibly on different sites. For
instance, the source and destination ambients participate in an In-step; similarly,
the source and parent ambients take part in an Out-step; the target ambient
participates in an Open-step. Each atomic step of the ambient calculus can be
decomposed into two parts: checking whether the local structure of the ambient
tree enables the step, and actually performing the migration.

The first part imposes some distributed synchronization. One may use a
global synchronous primitive existing at the operating system or networking
level, but such a solution is unrealistic in large-scale networks. A first range
of solutions can be designed by considering locks and critical sections in order
to serialize the implementation of atomic steps. For instance, the two ambients
participating in a reduction step are temporarily locked, and the locks are
released at the end of the step. However, this solution cannot be symmetric, in
the same way as there is no symmetric distributed solution to the Dining
Philosophers problem. Some ambients have to be distinguished; for instance, one
ambient could be the synchronizer of all ambients. Naturally, the nested
structure of ambients can be used; for instance, each ambient controls the
synchronization of its direct subambients. In both cases, one has to be careful
to avoid deadlocks or too much serialization. This solution would be similar to
Cardelli’s centralized implementation of an earlier variant of the ambient
calculus in Java. One advantage of a serialized solution is the ease of the
correctness proof of the implementation. On the negative side, each attempt to
perform a step takes several locks higher up in the ambient hierarchy; these
locks may be located at remote sites, leading to long delays before these locks
are released for other local steps. Moreover, due to the mobility discipline of
the ambient calculus, an ambient that migrates from one point to another in the
ambient hierarchy has to travel through an ambient enclosing both the origin and
the destination, thus inducing global bottlenecks.

A different set of solutions is fully asynchronous. Atomic steps of ambients 
are decomposed into several elementary steps, each involving only local syn- 
chronization. In this approach, each ambient step is implemented as a run of 
a protocol involving several messages. Concurrency is higher, as only the mov- 
ing ambient might not be available for other reduction steps. For instance, our 
solution never blocks steps involving parents of a moving ambient. The imple- 
mentation of migration towards a mobile target may be problematic, but can be 
handled independently of the implementation of ambient synchronization, e.g., 
using a forwarding mechanism. In our case, we simply rely on the strong mi- 
gration primitive of Jocaml. On the negative side, the correctness proof is more 
involved. 

In this paper, we present an asynchronous distributed algorithm for
implementing ambients, we make it precise as a translation into the join calculus
(the process calculus that is a model of Jocaml), and we refine this translation
into a distributed implementation of ambients written in Jocaml. The algorithm
provides an insight into the implementability of ambients. The Jocaml prototype
is a first, lightweight, distributed implementation of ambients. The translation
is proved correct in two stages: first we use barbed coupled simulations for the
correctness of the algorithm, then we use a hybrid barbed bisimulation for the
actual translation into the join calculus. Technically, the first stage is a first
application of coupled simulations in a reduction-based, asynchronous setting;
it relies on the introduction of an auxiliary ambient calculus extended with
transient states; it does not depend on the target language. The second stage is
a challenging application of the decreasing diagram technique. In combination,
these results imply that the translation preserves and reflects a variety of
global observation predicates.

The paper is organized as follows. In Section 2, we present the asynchronous
algorithm and we show a formal translation from ambient processes to join
calculus processes. We then discuss the correctness of the translation in
terms of observations. Next, we focus on the operational correspondence
between a process and its translation; to this end, we refine the ambient calculus
to express additional transient states induced by the translation. We then
state our main technical results and give an idea of their proofs, describe
more practical aspects of the Jocaml implementation, and conclude. In an
appendix, we recall the operational semantics of the distributed join calculus
and of the calculus of mobile ambients, and we give an overview of both calculi.
Additional discussions, technical details, and all the proofs appear in the
full version of this paper.

2 From Ambients to the Join Calculus

We describe the asynchronous algorithm, then we specify it as a translation 
from ambients to the join calculus. We begin with a fragment of the ambient 
calculus given by the grammar P ::= a[P] | in a.P | out a.P | P | P' | 0. 
In a second stage, we incorporate Open steps and other ambient constructs. 



2.1 An Asynchronous Algorithm 

The dynamic tree structure of ambients is represented by a doubly linked tree. 
Each node in the tree implements an ambient: each node contains non-ambient 
processes such as in b.P or out a.c[Q] running in parallel; each node also hosts
an ambient manager that controls the steps performed in this ambient and in its 
direct subambients. Different nodes may be running at different physical sites, so 
their ambient managers should use only asynchronous messages to communicate 
with one another. Since several ambients may have the same name, each node 
is also associated with a unique identifier. (Informally, we still refer to ambients 
by name, rather than unique identifier.) 

Each ambient points to its subambients and to its parent ambient. The down 
links are used for controlling subambients, the up link is used for proposing new 
actions. The parent of the moving ambient for an In-step knows the destination
ambient; the parent also knows the destination ambient (its own parent) for an
Out-step; it controls the opened ambient for an Open-step. Hence, the decision
to perform a step will always be taken by the parent of the affected ambient. 

Moves of ambient a in and out of ambient b correspond to three successive
steps, depicted below. Single arrows represent current links; double arrows
represent messages in transit.

We detail the dynamics of an In-step, e.g.,
$c[\,a[\mathit{in}\ b.Q] \mid b[0]\,] \rightarrow c[\,b[\,a[Q]\,]\,]$.

0-step: initially, a delegates the migration request In b to its current parent
   (here c); to this end, it uses its current up link to send a message to c saying
   that a is willing to move into an ambient named b.

1-step: the enclosing ambient c matches a’s request with a’s and b’s down links.
   Atomically, a’s request and the down link to a are erased, and a relocation
   message is sent to a; this message contains the address of b, so that a will be
   able to relocate to b, and also a descriptor of a’s successful action, so that a
   can complete this step by triggering its guarded process Q.

2-step: the moving ambient a receives c’s relocation message, relocates to b’s
   site, and updates its up link to point to b. It also sends a message to b that
   eventually registers a as a subambient of b, establishing the new down link.
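
The three steps can be traced on a tiny in-memory model of the doubly linked
tree. The Python sketch below is a sequential illustration, not the authors'
Jocaml code: the class Node and the functions attach, step0, step1 and step2 are
hypothetical names, messages are modelled as list entries, and identifier
refresh, continuations and the log of pending actions are left out.

```python
import itertools
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

_uids = itertools.count()  # source of unique identifiers (an assumption of this sketch)


@dataclass(eq=False)
class Node:
    name: str
    uid: int = field(default_factory=lambda: next(_uids))
    parent: Optional["Node"] = None                                # up link
    children: List["Node"] = field(default_factory=list)           # down links
    requests: List[Tuple[int, str]] = field(default_factory=list)  # delegated (uid, target name)
    relocations: List["Node"] = field(default_factory=list)        # relocation messages received


def attach(parent: Node, child: Node) -> None:
    child.parent = parent
    parent.children.append(child)


def step0(a: Node, target: str) -> None:
    """0-step: a delegates `in target` to its current parent through its up link."""
    a.parent.requests.append((a.uid, target))


def step1(c: Node) -> bool:
    """1-step: c matches a request with down links to the requester and a destination."""
    for uid, target in list(c.requests):
        mover = next((x for x in c.children if x.uid == uid), None)
        dest = next((x for x in c.children
                     if x.name == target and x is not mover), None)
        if mover is not None and dest is not None:
            c.requests.remove((uid, target))  # the request is erased
            c.children.remove(mover)          # the down link to the mover is erased
            mover.relocations.append(dest)    # relocation message carrying the address of b
            return True
    return False


def step2(a: Node) -> None:
    """2-step: a relocates, updates its up link, and registers with its new parent."""
    dest = a.relocations.pop(0)
    a.parent = dest
    dest.children.append(a)  # new down link


c, a, b = Node("c"), Node("a"), Node("b")
attach(c, a)
attach(c, b)
step0(a, "b")
fired = step1(c)
step2(a)
```

Between the 1-step and the 2-step the moving ambient has no down link pointing
to it, while c and b stay available for other steps, which is the point of
splitting the atomic move into local steps.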

The 1-step may preempt other actions delegated by a to its former parent c. 
Such actions should now be delegated to its new parent b. For that purpose, a’s 
ambient manager keeps a log of the pending actions delegated in 0-steps, and, 
as it completes one of these actions in a 2-step, it re-delegates all other actions 
towards its new parent. The log cannot be maintained by the parent, because 
delegation messages may arrive long after a’s departure. Moreover, in the case an 
ambient moves back into a former parent, former delegation messages may still 
arrive, and should not be confused with fresh ones. Such stale messages must be 
deleted. This is not directly possible in an asynchronous world, but equivalently 
each migration results in a modification of the unique identifier of the moving 
ambient, each delegation message is tagged with the current identifier, and the 
parent discards every message with an old identifier. 
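
The identifier scheme can be sketched in a few lines. This Python fragment is an
illustration under stated assumptions: ParentManager and its methods are
hypothetical names, identifiers are drawn from a global counter, and a message
is treated as stale exactly when its tag matches no current down link.

```python
import itertools

_fresh = itertools.count(1)  # fresh unique identifiers (an assumption of this sketch)


class ParentManager:
    def __init__(self):
        self.down = {}      # current down links: uid -> ambient name
        self.accepted = []  # delegation messages that matched a live down link

    def register(self, name):
        """A subambient arrives: it is given a fresh uid and a down link."""
        uid = next(_fresh)
        self.down[uid] = name
        return uid

    def depart(self, uid):
        """A subambient leaves: its down link is erased."""
        del self.down[uid]

    def deliver(self, uid, action):
        """Accept a delegation message only if it is tagged with a current uid."""
        if uid not in self.down:
            return False  # stale message from an earlier stay: discard it
        self.accepted.append((uid, action))
        return True


c = ParentManager()
old_uid = c.register("a")
in_transit = (old_uid, "in b")  # delegated by a, still travelling
c.depart(old_uid)               # a moves away ...
new_uid = c.register("a")       # ... and later moves back into c
stale_accepted = c.deliver(*in_transit)
fresh_accepted = c.deliver(new_uid, "out c")
```

Only the message tagged with the identifier of the current stay is kept, even
though both messages come from an ambient with the same name.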

An Out-step of a out of b corresponds to the same series of three steps. The
main difference is in step 1, as the enclosing ambient b matches a’s request with
a’s down link and its own name b, and passes its own up link in the relocation
message sent back to a.

2.2 A Simple Translation 

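The compositional translation $[\![\cdot]\!]_e$ appears in Figure 1. Overall, the
tree of nested ambients is mapped to an isomorphic tree of nested locations. Each
ambient is mapped to a join calculus location containing the definition $D$ of
the channel names that form the ambient interface, and containing processes that
represent the ambient state. The definition $D$ is composed of three groups of
rules $D_0$, $D_1$, and $D_2$ that respectively implement 0, 1, and 2-steps of
the algorithm.

To represent the distributed data structure used in the algorithm of
Section 2.1, an ambient is represented by an interface e, which is a record that
contains fields here, amb, subin, subout, reloc, in, and out. The here field is the
name of the location hosting the ambient, whereas the other fields are channel
names used to interact with this ambient. The translation is parameterized by
the interface e of the current enclosing ambient. A down link to a subambient
named b with interface $e_b$ and unique identifier (uid) j is represented as a
message $\mathit{amb}(j, b, e_b)$. For every ambient, the up link to its parent
ambient is represented by the parent interface e, which is stored in the state
message $s(a, i, e, l)$. In

$$
\begin{array}{rcl}
[\![ a[P] ]\!]_e & \stackrel{\mathrm{def}}{=} & \mathbf{def}\ AM_{a,e}(P)\ \mathbf{in}\ 0 \\
[\![ \mathit{in}\ a.P ]\!]_e & \stackrel{\mathrm{def}}{=} & \mathbf{def}\ \kappa() \triangleright [\![ P ]\!]_e\ \mathbf{in}\ e.\mathit{in}(a, \kappa) \\
[\![ \mathit{out}\ a.P ]\!]_e & \stackrel{\mathrm{def}}{=} & \mathbf{def}\ \kappa() \triangleright [\![ P ]\!]_e\ \mathbf{in}\ e.\mathit{out}(a, \kappa) \\
[\![ P \mid Q ]\!]_e & \stackrel{\mathrm{def}}{=} & [\![ P ]\!]_e \mid [\![ Q ]\!]_e \\
[\![ 0 ]\!]_e & \stackrel{\mathrm{def}}{=} & 0
\end{array}
$$

where the ambient manager $AM_{a,e}(P)$ is defined as:

$$
\begin{array}{rcl}
D_0 & \stackrel{\mathrm{def}}{=} & s(a,i,e,l) \mid \mathit{in}(b,\kappa) \triangleright s(a,i,e,l \cup \{\mathrm{In}\ b\ \kappa\}) \mid e.\mathit{subin}(i,b,\kappa) \\
 & \wedge & s(a,i,e,l) \mid \mathit{out}(b,\kappa) \triangleright s(a,i,e,l \cup \{\mathrm{Out}\ b\ \kappa\}) \mid e.\mathit{subout}(i,b,\kappa) \\
D_1 & \stackrel{\mathrm{def}}{=} & s(a,i,e,l) \mid \mathit{amb}(j,b,e_b) \mid \mathit{amb}(k,c,e_c) \mid \mathit{subin}(k,b,\kappa) \triangleright s(a,i,e,l) \mid \mathit{amb}(j,b,e_b) \mid e_c.\mathit{reloc}(e_b,\kappa) \\
 & \wedge & s(a,i,e,l) \mid \mathit{amb}(j,b,e_b) \mid \mathit{subout}(j,a,\kappa) \triangleright s(a,i,e,l) \mid e_b.\mathit{reloc}(e,\kappa) \\
D_2 & \stackrel{\mathrm{def}}{=} & s(a,i,e,l) \mid \mathit{reloc}(e',\kappa) \triangleright \mathit{go}(e'.\mathit{here});\ (I_{a,e_h,e'} \mid \kappa() \mid \mathit{Flush}(l,\mathit{in},\mathit{out},\kappa)) \\
D & \stackrel{\mathrm{def}}{=} & D_0 \wedge D_1 \wedge D_2 \\
I_{a,e_h,e} & \stackrel{\mathrm{def}}{=} & \mathbf{def}\ \mathbf{uid}\ i\ \mathbf{in}\ s(a,i,e,\emptyset) \mid e.\mathit{amb}(i,a,e_h) \\
AM_{a,e}(P) & \stackrel{\mathrm{def}}{=} & \mathit{here}\,[\,D : I_{a,e_h,e} \mid [\![ P ]\!]_{e_h}\,]
\end{array}
$$

with the record notation

$$
e_h = \{\mathit{here} = \mathit{here},\ \mathit{amb} = \mathit{amb},\ \mathit{subin} = \mathit{subin},\ \mathit{subout} = \mathit{subout},\ \mathit{reloc} = \mathit{reloc},\ \mathit{in} = \mathit{in},\ \mathit{out} = \mathit{out}\}
$$

Figure 1. Translation from In/Out ambient processes to the join calculus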



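addition, the state message contains the name a and the current uid i of the
ambient, and the log l of In and Out actions that have been delegated to the
parent ambient using e.

We resume our study of an In-action, considering the role of each message
in the translation of $c[\,a[\mathit{in}\ b.Q] \mid b[0]\,]$. Initially, the
translation defines a continuation $\kappa$ for Q and issues a message
$\mathit{in}(b, \kappa)$ in a, which is a subjective migration request into an
ambient named b.

The 0-step consists of delegating the request to the parent ambient. Using the
first rule of a’s $D_0$, the messages $s(a, i, e, l)$ and $\mathit{in}(b, \kappa)$
are consumed, the request is recorded in a’s log as an entry In b $\kappa$, and
the request is forwarded to the enclosing ambient c described by the interface e.
The ambient a remains active, with new state
$s(a, i, e, l \cup \{\mathrm{In}\ b\ \kappa\})$. In parallel, the message
$e.\mathit{subin}(i, b, \kappa)$ is a subambient move request sent to c, with the
explicit identifier i of the requester a.

The 1-step is performed by c’s ambient manager. The message
$\mathit{subin}(i, b, \kappa)$ may be consumed using the first rule of $D_1$. The
rule also requires that both the ambient that issued the request and another
destination ambient with name b be actually present. This step removes the down
link for the moving ambient (the message $\mathit{amb}(i, a, e_a)$), thus blocking
other actions competing for the same message, whereas the destination ambient
remains available for concurrent steps. A

$$
\begin{array}{rcl}
[\![ \mathit{open}\ a.P ]\!]_e & \stackrel{\mathrm{def}}{=} & \mathbf{def}\ \kappa() \triangleright [\![ P ]\!]_e\ \mathbf{in}\ e.\mathit{open}(a, \kappa) \\
[\![ \langle n \rangle ]\!]_e & \stackrel{\mathrm{def}}{=} & e.\mathit{send}(n) \\
[\![ (n).P ]\!]_e & \stackrel{\mathrm{def}}{=} & \mathbf{def}\ \kappa(n) \triangleright [\![ P ]\!]_e\ \mathbf{in}\ e.\mathit{recv}(\kappa) \\
[\![ !P ]\!]_e & \stackrel{\mathrm{def}}{=} & \mathbf{def}\ \kappa() \triangleright [\![ P ]\!]_e \mid \kappa()\ \mathbf{in}\ \kappa() \\
[\![ \nu a.P ]\!]_e & \stackrel{\mathrm{def}}{=} & \mathbf{def}\ \mathbf{fresh}\ a\ \mathbf{in}\ [\![ P ]\!]_e
\end{array}
$$

with additional rules in the definition of $AM_{a,e}(P)$:

$$
\begin{array}{rcl}
D'_1 & \stackrel{\mathrm{def}}{=} & s(a,i,e,l) \mid \mathit{amb}(j,b,e_b) \mid \mathit{open}(b,\kappa) \triangleright s(a,i,e,l) \mid e_b.\mathit{opening}(\kappa) \\
D'_2 & \stackrel{\mathrm{def}}{=} & s(a,i,e,l) \mid \mathit{opening}(\kappa) \triangleright f(e) \mid \kappa() \mid \mathit{Flush}(l, e.\mathit{in}, e.\mathit{out}, \kappa) \\
D_C & \stackrel{\mathrm{def}}{=} & s(a,i,e,l) \mid \mathit{recv}(\kappa) \mid \mathit{send}(n) \triangleright s(a,i,e,l) \mid \kappa(n) \\
D_F & \stackrel{\mathrm{def}}{=} & f(e) \mid \mathit{in}(b,\kappa) \triangleright f(e) \mid e.\mathit{in}(b,\kappa) \\
 & \wedge & f(e) \mid \mathit{out}(b,\kappa) \triangleright f(e) \mid e.\mathit{out}(b,\kappa) \\
 & \wedge & f(e) \mid \mathit{open}(b,\kappa) \triangleright f(e) \mid e.\mathit{open}(b,\kappa) \\
 & \wedge & f(e) \mid \mathit{amb}(j,b,e_b) \triangleright f(e) \mid e.\mathit{amb}(j,b,e_b) \\
 & \wedge & f(e) \mid \mathit{subin}(k,b,\kappa) \triangleright f(e) \mid e.\mathit{subin}(k,b,\kappa) \\
 & \wedge & f(e) \mid \mathit{subout}(k,b,\kappa) \triangleright f(e) \mid e.\mathit{subout}(k,b,\kappa) \\
 & \wedge & f(e) \mid \mathit{recv}(\kappa) \triangleright f(e) \mid e.\mathit{recv}(\kappa) \\
 & \wedge & f(e) \mid \mathit{send}(b) \triangleright f(e) \mid e.\mathit{send}(b) \\
D & \stackrel{\mathrm{def}}{=} & D_0 \wedge D_1 \wedge D'_1 \wedge D_2 \wedge D'_2 \wedge D_C \wedge D_F
\end{array}
$$

with the extended record notation

$$
e_h = \{\mathit{here} = \mathit{here},\ \mathit{amb} = \mathit{amb},\ \mathit{subin} = \mathit{subin},\ \mathit{subout} = \mathit{subout},\ \mathit{open} = \mathit{open},\ \mathit{reloc} = \mathit{reloc},\ \mathit{in} = \mathit{in},\ \mathit{out} = \mathit{out},\ \mathit{opening} = \mathit{opening},\ \mathit{recv} = \mathit{recv},\ \mathit{send} = \mathit{send}\}
$$

Figure 2. Additional clauses for the full translation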



relocation message $e_a.reloc(e_b, \kappa)$ is emitted, signalling to the requesting
ambient a that it must migrate to the ambient with interface $e_b$, with
continuation $\kappa$.

The 2-step, using a's rule $D_2$, consumes the message on reloc and the current
state message, performs a join calculus migration to the location of the
destination ambient, then resumes the activities of a with parent interface $e_b$.
To this end, the process $I_{a,e_a,e_b}$ restores an active state: it generates a
fresh uid $i'$, issues a local state message $s(a, i', e_b, \emptyset)$ representing
the up link, and sends to the new parent a message $e_b.amb(i', a, e_a)$
representing the down link. (Since no down link will ever mention the previous
uid i, previous delegation messages tagged with i will never match a rule of
$D_1$. In our implementation, these stale messages are actually discarded.) In
addition, the message $\kappa()$ triggers the continuation. Finally, the process
$Flush(l \cup \{In\ b\ \kappa\}, in, out, \kappa)$ restarts any preempted actions
appearing in the log. As defined in the appendix, this process emits a message
$in(d, \kappa')$ or $out(d, \kappa')$ for every entry $In\ d\ \kappa'$ or
$Out\ d\ \kappa'$ appearing in the log l and such that $\kappa' \neq \kappa$. These
entries correspond to actions preempted by the migration; they will be delegated
to a new parent through other iterations of 0-steps.
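
The log-flushing step described above can be illustrated with a minimal Python sketch. It is not the Jocaml code: the `Entry` type, the `flush` function, and the string names used for continuations are all assumptions made for illustration. The sketch only mirrors the stated behaviour, namely that one message is re-emitted for every logged `In` or `Out` entry whose continuation differs from the one that just completed.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """A log entry `In d k'` or `Out d k'` (hypothetical representation)."""
    tag: str   # "In" or "Out"
    dest: str  # name d of the ambient targeted by the preempted action
    cont: str  # continuation name k'


def flush(log, completed):
    """Return the messages re-emitted for preempted actions.

    Every entry whose continuation differs from `completed` yields a
    message on the channel `in` or `out`, carrying (d, k').
    """
    channel = {"In": "in", "Out": "out"}
    return [(channel[e.tag], e.dest, e.cont) for e in log if e.cont != completed]


# The action with continuation k1 has just completed; the other two were preempted.
log = {Entry("In", "b", "k1"), Entry("Out", "c", "k2"), Entry("In", "d", "k3")}
messages = sorted(flush(log, "k1"))
```

In this sketch the log is a finite set, as in the text, so the order in which the preempted actions are re-emitted is unspecified; the example sorts the result only to make it easy to inspect.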

Similarly, an Out-step is performed according to the algorithm by using the
second rule of $D_0$ of the moving ambient, the second rule of $D_1$ of the enclosing
ambient, and finally rule $D_2$ of the moving ambient.



2.3 Dealing with Other Ambient Constructs 

The translation of Figure 2 generalizes the translation above to the full ambient
calculus. For each additional construct, we add a clause to the compositional
translation $\llbracket \cdot \rrbracket_e$. We also upgrade $AM_{a,e}(P)$ and use a
larger environment e with extra fields for open, opening, recv, and send.

Values and Scopes. Ambient names are mapped to identical names in the
join calculus. The two calculi rely on similar lexical scope disciplines, with
scope extrusion performed by structural equivalence (rule Scope in join,
rules R1 and R2 in ambients). Thus, it suffices to translate the creation of
local ambient names $\nu a.P$ into binders fresh a with the same scope in the
join calculus.

Communication. Ambient communication is implemented by supplementing
every ambient manager with a rule $D_C$ that binds two channels send and recv
and synchronizes message outputs and message requests. This encoding is
much like the encoding of pi-calculus channels into the join calculus (see [?]).

Replication. Each replicated process $!P$ is coded using a standard recursive
encoding of infinite loops in the join calculus.

Open. Ambient processes may dissolve ambient boundaries using the open
capabilities. In contrast, join calculus names are statically attached to their
defining location, and location boundaries never disappear. We thus lose the
one-to-one mapping from ambients to locations, and distinguish two states
for each location: either the ambient is still running and the message s is
present, or it has been opened and henceforth messages sent to its interface
are forwarded to the enclosing ambient. The indirection is achieved by using
a persistent message sent on f defined in $D_F$. This leads to complications in
the proofs, as one must prove that these opened locations do not interfere
with the rest of the translation.
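
The two states of a location described under Open can be sketched as follows. This is a simplified Python model written for illustration only: the `Interface` class, its `opened` flag, and its `inbox` list are assumptions, and they stand in for the state message s, the persistent forwarder f, and the pending messages of a location. It shows how a message sent to an opened ambient ends up at the nearest enclosing ambient that is still running.

```python
class Interface:
    """Hypothetical stand-in for the interface of one ambient location."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent   # interface of the enclosing ambient, if any
        self.opened = False    # False: still running; True: only forwards
        self.inbox = []        # messages waiting at this location

    def send(self, message):
        """Deliver a message, following forwarders through opened ambients."""
        target = self
        while target.opened:
            target = target.parent
        target.inbox.append(message)
        return target.name

    def open(self):
        """Dissolve the boundary: pending and future messages go to the parent."""
        if self.parent is None:
            raise ValueError("the top-level location cannot be opened")
        self.opened = True
        pending, self.inbox = self.inbox, []
        for message in pending:
            self.parent.send(message)


top = Interface("top")
a = Interface("a", top)
b = Interface("b", a)

first = b.send("m1")   # b is still running, so it keeps the message
b.open()               # b now forwards to a, including the pending m1
second = b.send("m2")
a.open()               # a forwards to top in turn
third = b.send("m3")
```

The location object itself never disappears in this sketch, which matches the remark that location boundaries persist in the join calculus even after the ambient has been opened.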

3 Correctness of the Translation 

The distributed synchronization algorithm seems to depart from the operational
semantics of ambients, and the translation of nested ambients yields arbitrarily
large terms with numerous instances of the algorithm running in parallel. This
makes the correctness of the distributed implementation problematic.
Technically, both calculi have a reduction-based semantics, which can be equipped
with standard notions of observation. This provides a precise setting for
establishing correctness on the translation, rather than on an abstraction of the
algorithm. (Of course, there are still minor discrepancies between the translation
and the actual code in Jocaml; see section 6.)

We first define a syntactic notion of observation. For each calculus, we use a
family of predicates on processes P indexed by names b, written $P \downarrow_b$:

- In the ambient calculus, $P \downarrow_b$ when b is free in P and
  $P \equiv \nu\tilde{n}.(\, b[Q] \mid R \,)$.

- In the join calculus, $P \downarrow_b$ when b is free in P and
  $P \equiv \alpha[D' : b(\tilde{v}) \mid P'] \wedge D$.

Next, we express the correctness of the translation in terms of the following
predicates, for both ambient and join processes:

- A process P has a weak barb on b (written $P \Downarrow_b$) when
  $P \to^{*} P' \downarrow_b$.

- A process P diverges (written $P \Uparrow$) when P has an infinite series of steps.

- A process P has a fair-must barb on b (written $\Box P \Downarrow_b$) when for all
  P' such that $P \to^{*} P'$, we have $P' \Downarrow_b$.
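
On a finite reduction graph, the three predicates above can be computed directly. The following Python sketch assumes a hypothetical encoding that is not part of the paper: `steps` maps each state to its one-step successors and `barbs` maps each state to the names on which it has a strong barb. Divergence is decided by looking for a reachable cycle, which is only valid because the graph is finite.

```python
def reachable(steps, start):
    """All states reachable from `start` in zero or more reduction steps."""
    seen, todo = {start}, [start]
    while todo:
        state = todo.pop()
        for nxt in steps.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen


def weak_barb(steps, barbs, p, b):
    """P has a weak barb on b: some derivative of P has a strong barb on b."""
    return any(b in barbs.get(q, ()) for q in reachable(steps, p))


def fair_must_barb(steps, barbs, p, b):
    """Every derivative of P still has a weak barb on b."""
    return all(weak_barb(steps, barbs, q, b) for q in reachable(steps, p))


def diverges(steps, p):
    """P has an infinite run; on a finite graph, a cycle is reachable from P."""
    status = {}  # 1 = on the current search path, 2 = fully explored

    def visit(state):
        status[state] = 1
        for nxt in steps.get(state, ()):
            if status.get(nxt) == 1 or (nxt not in status and visit(nxt)):
                return True
        status[state] = 2
        return False

    return visit(p)


# P may commit to a branch that exhibits b, or to a branch that loops silently.
steps = {"P": ["P1", "P2"], "P1": ["P1b"], "P2": ["P2"]}
barbs = {"P1b": {"b"}}
```

In this example P has a weak barb on b but no fair-must barb on b, since the derivative P2 can never exhibit b. That gap is what the fair-must predicate is meant to detect.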

In combination, these predicates give a precise content to the informal notion
of correctness: “the translation should neither suppress existing behaviors, nor
introduce additional behaviors.” The minimal notion of correctness for an
implementation is the reflection of weak barbs, which rules out spurious behaviors;
the converse direction states that the implementation does not discard potential
behaviors; for instance, it rules out new deadlocks, or even an empty
translation. The preservation of convergence is of pragmatic importance. In addition,
correctness for fair-must tests relates infinite computations [?], and rules out
implementations with restrictive scheduling policies.

Since top-level ambients are not translated into messages on free names, the
observation of translated ambients requires some special care. To this end, we
supplement the translation at top-level with a definition that reveals the
presence of a particular barb in the source process. Using D and $e_t$ as defined
in figure 2 and for a given interface e, we define the top-level translation

\[
\begin{array}{rcl}
\llbracket P \rrbracket^{b} &\stackrel{\mathrm{def}}{=}& [\, D \wedge D_t \wedge \mathrm{uid}\ i : s(a, i, e, \emptyset) \mid t(b) \mid \llbracket P \rrbracket_{e_t} \,] \\
D_t &\stackrel{\mathrm{def}}{=}& s(a, i, e, l) \mid amb(j, b, e_b) \mid t(b) \;\triangleright\; s(a, i, e, l) \mid amb(j, b, e_b) \mid yes()
\end{array}
\]

Without loss of generality, we always assume that the names in a, i, t, e, yes
do not clash with any name free in P, and that the location and channel names
introduced by the translation do not clash with any name of P.

We are now ready to state that all the derived observations discussed above 
are preserved and reflected by the translation: 

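Theorem 1. For every ambient process P and name b, we have $P \Downarrow_b$ if and only
if $\llbracket P \rrbracket^{b} \Downarrow_{yes}$; $P \Uparrow$ if and only if $\llbracket P \rrbracket^{b} \Uparrow$; and $\Box P \Downarrow_b$ if and only if $\Box \llbracket P \rrbracket^{b} \Downarrow_{yes}$.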

While correctness is naturally expressed in terms of observations along the 
reduction traces of processes, its proof is challenging. In particular, a direct 
approach clearly leads to intractable inductions on both the syntax of the source 
process and on the series of reductions. 
4 A Calculus of Ambients Extended with Transient States



In order to prove theorem 1, we introduce an intermediate calculus of ambients
with constructs that materialize the key transient states of the algorithm of
section 2, and we equip this calculus with a reduction semantics in direct
correspondence with the algorithm. For instance, atomic In steps are decomposed
into series of 1- and 2-steps. (However, 0-steps are not represented, inasmuch
as requests are always eventually delegated to the current parent.) In the next
section, we rely on the extended calculus to establish correctness as the
composition of two equivalences. First, we use coupled simulations [?] to relate the
two semantics for ambients; then, we use bisimulations to relate ambients equipped
with the extended semantics to their join calculus translations.

The grammar for the extended calculus appears in Figure 3. It has new
processes representing ambients that are committed to move or to be opened,
as the result of their father’s 1-step (we call such transient ambients stubs), and
also new processes marking the future position of migrating ambients (we call
such precursors scions). Pairs of stubs and scions are syntactically connected by
a marker i. The extended operational semantics appears in Figure 4. It is a
reduction-based semantics with auxiliary labels for stubs and scions. Each of the
reduction steps In, Out, and Open is decomposed in two steps. Initial steps $\to_1$
introduce stubs and scions; completion steps $\to_2$ consume them. The reduction
rules for Recv and Repl are those of the original semantics, except that we write
$\to_C$ instead of $\to$. Overall, we obtain a reduction system for extended ambient
processes with steps $\to_{12C} \;=\; \to_1 \cup \to_2 \cup \to_C$. For the sake of
comparison, we also extend the original reduction semantics and the observation
predicates from ambients to extended ambients.

Initially, stubs and scions are neighbors, but they may drift apart as the result 
of other steps before performing the matching completion step, so an auxiliary 
labeled transition system is used to match stubs and scions. The completion of 
a deferred migration is thus rendered as a global communication step between 
processes residing in two different ambient contexts, and syntactically linked by 
the name i. This may seem difficult to implement, but actually scions have no 
operational contents; they represent the passive target for a strong migration. 
To illustrate the extended calculus, we give below the reductions for a process 
with a critical pair. Processes Pi are regular ambient processes; processes Qi are 
transient processes reflecting intermediate states of our algorithm. 



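[Diagram: reductions of a process with a critical pair, relating the regular
processes $P_i$ and the transient processes $Q_i$ through steps labelled Open 1 and
Open 2, among others.]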
\[
\begin{array}{rcll}
P &::=& & \text{extended ambient process} \\
  & & \ldots & \text{all the constructors of Figure 5} \\
  &\mid& X\,n[P] & \text{stub} \\
  &\mid& i & \text{scion} \\
  &\mid& \nu i.P & \text{marker restriction} \\
X &::=& & \text{state extension} \\
  & & i\{P\}\cdot & \text{the stub is committed to move to } i \\
  &\mid& o\{P\}\cdot & \text{the stub is being opened}
\end{array}
\]

Well-formed conditions: stubs and scions may occur only in extended evaluation
contexts; restricted markers i must be used linearly (exactly one stub and one
scion). In the following, we write $X^{=}n[P]$ for either $X\,n[P]$ or $n[P]$.

Figure 3. Syntax for an ambient calculus extended with transient states



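Extended evaluation contexts $E(\cdot)$ are defined by the grammar

\[ E(\cdot) ::= \cdot \;\mid\; P \mid E(\cdot) \;\mid\; E(\cdot) \mid P \;\mid\; X^{=}n[E(\cdot)] \;\mid\; \nu n.E(\cdot) \;\mid\; \nu i.E(\cdot) \]

Structural equivalence $\equiv$ is the smallest equivalence relation on processes that
is closed by application of extended evaluation contexts and by α-conversion, and
that satisfies the axioms P0, P1, P2, C0, C1, R1, R2 of figure 6 and:

\[ \text{R2X} \quad \dfrac{m \neq n \qquad m \text{ is not free in } X}{X\,n[\nu m.P] \equiv \nu m.X\,n[P]} \]

Labeled transitions are the smallest families of relations closed by application of
restriction-free extended evaluation contexts and such that

\[ \text{Stub} \quad i\{Q\}\cdot n[R] \xrightarrow{\ \ldots\ } 0 \qquad\qquad \text{Scion} \quad i \xrightarrow{\ \ldots\ } P \]

Original reduction steps $\to$ are defined as in Figure 6 with extended evaluation
contexts. Initial steps $\to_1$, Completion steps $\to_2$, and Other steps $\to_C$ are the
smallest relations closed by structural equivalence, by application of extended
evaluation contexts, and such that:

\[
\begin{array}{ll}
\text{In 1} & m[P] \mid n[in\ m.Q \mid R] \;\to_1\; \nu i.(\, m[i \mid P] \mid i\{Q\}\cdot n[R] \,) \\[4pt]
\text{Out 1} & X^{=}m[P \mid n[out\ m.Q \mid R]] \;\to_1\; \nu i.(\, i \mid X^{=}m[P \mid i\{Q\}\cdot n[R]] \,) \\[4pt]
\text{Move 2} & \dfrac{P \xrightarrow{\ \ldots\ } P' \qquad Q \xrightarrow{\ \ldots\ } Q'}{\nu i.(P \mid Q) \;\to_2\; P' \mid Q'} \\[8pt]
\text{Open 1} & open\ n.Q \mid n[R] \;\to_1\; o\{Q\}\cdot n[R] \\[4pt]
\text{Open 2} & o\{Q\}\cdot n[R] \;\to_2\; Q \mid R \\[4pt]
\text{Recv} & \langle n \rangle \mid (x).P \;\to_C\; P\{n/x\} \\[4pt]
\text{Repl} & !P \;\to_C\; P \mid\, !P
\end{array}
\]

Figure 4. Semantics for an ambient calculus extended with transient states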
5 Coupled Simulations and Operational Correspondence

We continue our study of correctness in terms of equivalences based upon weak
barbs. These equivalences are essential to obtain a modular proof. As a side
benefit, they also provide a finer account of correctness. (See [?] for a discussion
of equivalences and encodings in process calculi.) Instead of equivalences, we
actually often use relations ranging over different domains $\mathcal{P}_a$ and
$\mathcal{P}_b$, equipped with different notions of reduction steps $\to_a$, $\to_b$ and
families of observations $\Downarrow_{a,x}$ and $\Downarrow_{b,x}$.

Definition 1 (Barbed bisimulations). A relation $\mathcal{R} \subseteq \mathcal{P}_a \times \mathcal{P}_b$ is a weak
barbed simulation when, for all $P \mathrel{\mathcal{R}} Q$, we have (1) if $P \to_a^{*} P'$, then there
exists Q' such that $Q \to_b^{*} Q'$ and $P' \mathrel{\mathcal{R}} Q'$; (2) for all x, if $P \Downarrow_{a,x}$ then $Q \Downarrow_{b,x}$.
$\mathcal{R}$ is a barbed bisimulation when $\mathcal{R}$ and its converse $\mathcal{R}^{-1}$ are barbed simulations.
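
For finite reduction graphs, Definition 1 can be checked mechanically. The Python sketch below is an illustration under assumptions that are not in the paper: each system is given by a dictionary `steps` of one-step successors and a dictionary `barbs` of strong barbs, and a relation is a set of pairs of state names. The example relates a one-step system to a two-step system with an intermediate state Q1.

```python
def reachable(steps, start):
    """All states reachable from `start` in zero or more steps."""
    seen, todo = {start}, [start]
    while todo:
        state = todo.pop()
        for nxt in steps.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen


def weak_barbs(steps, barbs, p):
    """Names x such that p has a weak barb on x."""
    names = set()
    for q in reachable(steps, p):
        names |= barbs.get(q, set())
    return names


def is_barbed_simulation(rel, steps_a, barbs_a, steps_b, barbs_b):
    """Check conditions (1) and (2) of Definition 1 for every related pair."""
    for p, q in rel:
        q_derivatives = reachable(steps_b, q)
        for p2 in reachable(steps_a, p):
            if not any((p2, q2) in rel for q2 in q_derivatives):
                return False
        if not weak_barbs(steps_a, barbs_a, p) <= weak_barbs(steps_b, barbs_b, q):
            return False
    return True


def is_barbed_bisimulation(rel, steps_a, barbs_a, steps_b, barbs_b):
    """Both the relation and its converse must be barbed simulations."""
    converse = {(q, p) for p, q in rel}
    return (is_barbed_simulation(rel, steps_a, barbs_a, steps_b, barbs_b)
            and is_barbed_simulation(converse, steps_b, barbs_b, steps_a, barbs_a))


steps_a, barbs_a = {"P": ["P'"]}, {"P'": {"b"}}
steps_b, barbs_b = {"Q": ["Q1"], "Q1": ["Q'"]}, {"Q'": {"b"}}
rel = {("P", "Q"), ("P'", "Q'")}
```

Here `rel` is a barbed simulation, but its converse is not, because the intermediate state Q1 is unrelated to anything. Adding the pair (P, Q1) yields a barbed bisimulation, since Q1 has not committed to anything that P cannot still do.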

Bisimulations come with effective proof techniques that consider only a few
steps at a time, rather than whole execution traces. Unfortunately, barbed
bisimilarity $\approx$ (the largest barbed bisimulation closed by application of
evaluation contexts) is too discriminating for our protocol. Transient processes
such as $Q_1$ in the example above account for a partially-committed internal
choice: $Q_1$ may reduce to $P_1$ and $P_3$, but not to $P_2$. In some contexts, they
are not bisimilar to any derivative of P in the original ambient semantics. To
address this issue of gradual commitment, Parrow and Sjödin proposed coarser
relations called coupled simulations [?]. We liberally adapt their definition to
ambients:

Definition 2 (Coupled simulations). The relations $\preccurlyeq\ \subseteq \mathcal{P}_a \times \mathcal{P}_b$ and
$\succcurlyeq\ \subseteq \mathcal{P}_b \times \mathcal{P}_a$ form a pair of barbed coupled simulations when
$\preccurlyeq$ and $\succcurlyeq$ are barbed simulations that meet the coupling conditions:
(1) if $P \preccurlyeq Q$, then $Q \to_b^{*}\succcurlyeq P$; (2) if $Q \succcurlyeq P$, then $P \to_a^{*}\preccurlyeq Q$.

The discrepancy between $\preccurlyeq$ and $\succcurlyeq$ is most useful for handling transient
states such as $Q_1$. The coupling conditions guarantee that every transient state
can be mapped both to a less advanced state and to a more advanced one. In our
case, we would have $Q_1 \succcurlyeq P$ and $P_i \preccurlyeq Q_1$ for i = 1, 3.

Correctness of the Asynchronous Algorithm. The first stage of our correctness
argument is expressed as coupled simulations between ambient processes equipped
with the original and the extended semantics. In the statement below, related
processes have the same syntax, but live in different calculi, equipped with
different reduction semantics.

Theorem 2. Let $\lessgtr$ be the union of $\preccurlyeq \cap \succcurlyeq^{-1}$ for all barbed coupled
simulations between ambient and extended ambient processes that are closed by
application of evaluation contexts. For all ambient processes P, we have $P \lessgtr P$.

The proof appears in [?]; it makes apparent some subtleties of our algorithm due
to additional concurrency. After a first series of results on partial commutation
properties for extended steps, the key lemmas establish that, for any ambient
process P and extended ambient process Q, if $P \to_{12C}^{*} Q$, then (1) for some
ambient process P' we have $Q \to_{12C}^{*} P'$ and (2) for any such process P', we also
have $P \to^{*} P'$ in the original semantics.
Operational Correspondence. The second stage of the proof relates ambients
equipped with the extended semantics to their join calculus translations. It is
simpler than the first one, in principle, but its proof is complicated because the
translation makes explicit many details of the implementation that are
inessential to the algorithm.

In order to express the correspondence of observations across the translation,
we supplement the top-level translation $\llbracket \cdot \rrbracket^{b}$ of theorem 1 with an external
choice of the ambient barb to be tested. With the same notations, we write
$\llbracket \cdot \rrbracket^{*}$ for the translation that maps every process P to the process
$[\, D \wedge D_t \wedge \mathrm{uid}\ i : s(a, i, e, \emptyset) \mid p(t) \mid \llbracket P \rrbracket_{e_t} \,]$. As before, we assume
that names in a, i, e, p, t, and yes do not clash with names free in P. At any
point, we can use the evaluation context
$T_b(\cdot) \stackrel{\mathrm{def}}{=} \top[\, p(t) \triangleright t(b) : 0 \,] \wedge (\cdot)$ to test a translated ambient barb
on b by testing the plain join calculus barb $T_b(\cdot) \Downarrow_{yes}$. We have
$\llbracket P \rrbracket^{b} \approx T_b(\llbracket P \rrbracket^{*})$ in the join calculus.

Theorem 3 (Correctness of the translation). Let $\approx$ be the largest bisimulation
between extended ambient processes with reductions $\to_{12C}$ and join processes
such that $Q \approx R$ implies $Q \Downarrow_b$ iff $T_b(R) \Downarrow_{yes}$. For all ambient
processes P, we have $P \approx \llbracket P \rrbracket^{*}$.

In the long version of the paper [?], a more precise strong bisimulation up to
bookkeeping result is proved for the translations of all reachable extended ambient
processes. Since every significant transient state induced by the translation has
been lifted to the extended ambient calculus, its proof essentially amounts to
an operational correspondence between the two calculi. We partition reductions
in the join calculus according to the static rule of the translation being used.
For instance, we let $\to_1$ steps in the join calculus be the steps using a rule
of a definition $D_1 \wedge D'_1$ of figure 2; these steps create a reloc or an opening
message, and are in operational correspondence with source $\to_1$ steps. We obtain
two main classes of join calculus steps: steps $\to_{12C}$ that can be traced back to
extended ambient steps, and “bookkeeping” steps $\to_B$, which are auxiliary steps
used to trigger continuations, migrate, manage the logs, or unfold new ambient
managers.

The main lemmas describe dynamic simplifications of derivatives of the
translation, which are required to obtain translations of derivatives in the extended
source calculus. These lemmas are expressed as elementary commutation
diagrams between simplification relations and families of reduction steps. For
instance, one lemma states that “stale messages” can be discarded; another, more
complex lemma states that locations and ambient managers representing opened
ambients can be eliminated, effectively merging the contents of opened ambients
with the contents of their previously-enclosing ambient. To conclude, we
exhibit a bisimulation relation between extended ambients equipped with steps
$\to_{12C}$ and global translations of these extended ambients equipped with join
calculus steps $\to_{12C}$ and $\to_B$. The proof is structured using the decreasing
diagram technique of [?], whose conditions guarantee that every weak simulation
diagram in the final proof can be obtained by gluing previously-established diagrams.
6 Distributed Implementation

In this section, we briefly describe the actual implementation in Jocaml. The
initialization and the dynamics of distribution for ambient processes among
several Jocaml runtimes lead to some design choices, discussed in the long version
of the paper [?]. We also refer to [?] for the source code, setup instructions,
and programming examples.

Our implementation closely follows the translation given in figures 1 and 2.
Since Jocaml already provides support for mobility, local synchronization, and
run-time distribution, our code is very compact: less than 400 lines for the
interpreter, less than 40k in bytecode for the object files. The main differences
between the formal translation and the code are given below:

- Messages in the implementation may pass names, but also arbitrary chains of
  capabilities [?], as for instance in the ambient process
  $\langle in\ a.out\ b \rangle \mid\, !(x).x.\langle x \rangle$.

- The implementation is an interpreter that delays the translation of guarded
  processes. The implementation maintains an environment for local variables,
  and performs dynamic type checking whenever a value is used either as a
  name or as a capability.

- The translation relies on non-linear join-patterns, which are not available in
  Jocaml. More explicitly, the implementation relies on hash tables to cache
  some messages: when a message arrives on subin, subout, amb, or open, either
  it is immediately used in combination with the message cache, or it is added
  to the cache.

- The formal translation of replication always yields a diverging computation
  (as in the source ambient process). More reasonably, the interpreter unfolds
  replication on demand: since every ambient reduction involves at most two
  copies of a replicated process, it suffices to initially unfold two copies, then to
  unfold an additional copy whenever a fresh copy is used or modified. Hence,
  the process $!a[\,]$ does not diverge, while $!a[in\ a]$ still does.
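
The message cache mentioned in the list above can be sketched in Python as follows. This is an illustration only and does not reproduce the Jocaml code: the `MessageCache` class, its `arrive` method, and the choice of the shared ambient name as hash key are assumptions. The sketch shows how a pattern that mentions the same name twice, such as the pair `amb(j, b, e_b)` and `open(b, k)`, can be matched by looking up the shared name b in a table.

```python
from collections import defaultdict


class MessageCache:
    """Caches messages until a partner with the same key arrives (hypothetical)."""

    def __init__(self):
        # (channel, key) -> payloads waiting for a partner, in arrival order
        self.pending = defaultdict(list)

    def arrive(self, channel, key, payload, partner_channel):
        """Handle one incoming message.

        If a message on `partner_channel` with the same key is cached, consume
        it and return the pair (partner payload, payload). Otherwise cache the
        new message and return None.
        """
        waiting = self.pending[(partner_channel, key)]
        if waiting:
            return (waiting.pop(0), payload)
        self.pending[(channel, key)].append(payload)
        return None


cache = MessageCache()
# An open request for b arrives before any sub-ambient named b is known.
r1 = cache.arrive("open", "b", "k", partner_channel="amb")
# The sub-ambient b registers: the cached request can now fire.
r2 = cache.arrive("amb", "b", ("j", "e_b"), partner_channel="open")
# A request for another name does not match anything.
r3 = cache.arrive("open", "c", "k2", partner_channel="amb")
```

One simplification should be noted: the sketch consumes both messages when they match, whereas the translation re-emits the amb message on the right-hand side of its rules, so a closer model would put the amb entry back into the cache after firing.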

7 Conclusions 

We translated Mobile Ambients into the join calculus, and gave a first, asyn- 
chronous, distributed implementation of the ambient calculus in Jocaml, with a 
high level of concurrency. The synchronization mechanisms of Ambients turned 
out to be challenging first to implement, then to prove correct. This provides an 
insight into the ambient calculus as a model of concurrency. At the same time, 
this shows how Jocaml and its formal model can be used to tackle distributed 
and mobile implementations. For instance, the translation takes full advantage 
of join patterns to describe complex local synchronization steps, while a more 
traditional language would decompose these steps into explicit series of reads 
and updates protected by locks. 

In order to get a safer and more efficient implementation, one should care
about typing information for passed values and mobility capabilities [?]. Our
implementation performs dynamic type-checking on values, whereas it would be
preferable to use the static type-checking discipline of Jocaml. Similarly, static
knowledge of the actions that never appear in a given ambient can lead to more
efficient, specialized ambient managers.

Finally, little is known about actual programming with Ambients, or the
relevant abstractions to build a high-level language on top of the ambient calculus.
While we did not consider changing the source language, we believe that our
implementation provides an adequate platform for experimenting with
ambient-based language design. For instance, our translation would easily accommodate
the co-capabilities proposed in [?].

Acknowledgments. This work benefited from discussions with Luca Cardelli, 
Fabrice Le Fessant, and Luc Maranget. 
Appendix: Two Notions of Mobile Computations

We define our syntax and semantics for the calculus of Mobile Ambients and for
the distributed join calculus. We refer to the long version of the paper [?] for
an overview of the two calculi.

Operational Semantics for Ambients. Our syntax and semantics of the calculus
of ambients are given in figures 5 and 6.

This presentation slightly differs from Cardelli and Gordon’s [?] on several
counts. In spirit, it is actually closer to the harness semantics of [?]. Our
structural equivalence is more restrictive; it does not introduce or remove $\nu x$
binders; it operates only in evaluation contexts. Our operational semantics
represents the unfolding of replication as a silent reduction step rather than a
structural law. Also, communicated values are just ambient names, rather than both
names and chains of capabilities. (Chains of capabilities are fully supported in the
Jocaml implementation, but their encoding is heavy.)

Operational Semantics for the Join Calculus. Our syntax and semantics for the
join calculus are given in figures 7 and 8.

A path $\alpha$ is a string of location names a, b, . . . . Active locations are locations
not under a def; they can be nested. The path of an active sublocation $a[D : P]$ is
$\alpha.a$, where $\alpha$ is the path of its enclosing location. A configuration is a
conjunction of top-level locations such that every location has a unique name, such
that the set of paths for all active locations is prefix-closed except for the empty
prefix (i.e. active locations form a tree whose nodes are indexed by location names),
and such that every channel is defined in at most one location. In a configuration,
a location with path $\alpha.a$ is folded when it is the only top-level location whose
path contains a.

Names can be bound either as parameters $\tilde{y}$ in a message pattern $x(\tilde{y})$ or
as names defined in D by def D in P. The definition $a[D' : P]$ defines a and
names defined in D'. A definition containing a rule with message pattern $x(\tilde{y})$
also defines x.

To simplify the translation of section 2, we supplement the join calculus with
some convenient extensions, which are easily encoded in the plain join calculus.
We supplement definitions with new constructs uid i and fresh a that bind
names i and a, which we use to generate unique identifiers (we could use instead
dummy rules such as $f() \triangleright 0$). We use a record notation
$e = \{l_1 = x_1; \ldots; l_n = x_n\}$ as a shortcut for a tuple of names $x_1, \ldots, x_n$
passed in a consistent order, and write $e.l_i$ for $x_i$. We use an algebraic notation
$In\ b\ \kappa$, $Out\ b\ \kappa$ for log entries with tags In, Out and names b, $\kappa$. We use
finite sets of log entries, interpreted in the standard mathematical sense. Rather
than making explicit a standard encoding of set iterators for implementing the
process $Flush(l, in, out, \kappa)$, we supplement the operational semantics of the join
calculus with a rule for flushing logs of messages:

\[
\text{Flush} \quad Flush(l, in, out, \kappa) \;\to\; \prod_{In\ d\ \kappa' \in l,\ \kappa' \neq \kappa} in(d, \kappa') \;\Bigm|\; \prod_{Out\ d\ \kappa' \in l,\ \kappa' \neq \kappa} out(d, \kappa')
\]
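\[
\begin{array}{rcll}
P &::=& & \text{ambient process} \\
  & & n[P] & \text{ambient} \\
  &\mid& P \mid P' & \text{parallel composition} \\
  &\mid& C.P & \text{guarded process} \\
  &\mid& \nu n.P & \text{name restriction} \\
  &\mid& \langle n \rangle & \text{asynchronous message} \\
  &\mid& (x).P & \text{message reception} \\
  &\mid& !P & \text{replication} \\
  &\mid& 0 & \text{inert process} \\
C &::=& & \text{capability} \\
  & & in\ n & \text{ingoing migration} \\
  &\mid& out\ n & \text{outgoing migration} \\
  &\mid& open\ n & \text{ambient dissolution}
\end{array}
\]

Figure 5. Syntax for the ambient calculus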



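Evaluation contexts $E(\cdot)$ are defined by the grammar

\[ E(\cdot) ::= \cdot \;\mid\; P \mid E(\cdot) \;\mid\; E(\cdot) \mid P \;\mid\; n[E(\cdot)] \;\mid\; \nu n.E(\cdot) \]

Structural equivalence $\equiv$ is the smallest equivalence relation closed by
application of evaluation contexts, by α-conversion, and such that

\[
\begin{array}{ll}
\text{P0} & P \mid 0 \equiv P \\[2pt]
\text{P1} & P \mid P' \equiv P' \mid P \\[2pt]
\text{P2} & (P \mid P') \mid P'' \equiv P \mid (P' \mid P'') \\[6pt]
\text{R1} & \dfrac{n \text{ is not free in } P}{P \mid \nu n.Q \equiv \nu n.(P \mid Q)} \\[10pt]
\text{R2} & \dfrac{m \neq n}{m[\nu n.P] \equiv \nu n.m[P]}
\end{array}
\]

Ambient reduction $\to$ is the smallest relation closed by structural equivalence, by
application of evaluation contexts, and such that

\[
\begin{array}{ll}
\text{In} & m[P] \mid n[in\ m.Q \mid R] \;\to\; m[P \mid n[Q \mid R]] \\[2pt]
\text{Out} & m[P \mid n[out\ m.Q \mid R]] \;\to\; m[P] \mid n[Q \mid R] \\[2pt]
\text{Open} & open\ n.Q \mid n[R] \;\to\; Q \mid R \\[2pt]
\text{Recv} & \langle n \rangle \mid (x).P \;\to\; P\{n/x\} \\[2pt]
\text{Repl} & !P \;\to\; P \mid\, !P
\end{array}
\]

Figure 6. Operational semantics for the ambient calculus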
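
The rules In, Out, and Open can be exercised on small terms with the following Python sketch. It is a toy interpreter written for illustration, not the implementation described in the paper: the classes `Amb` and `Cap`, the functions `step` and `show`, and the representation of a parallel composition as a Python list are all assumptions. Restriction, communication, and replication are left out, and the function simply fires the first redex it finds.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Amb:
    """An ambient n[P]; `body` is the parallel composition P."""
    name: str
    body: list = field(default_factory=list)


@dataclass(eq=False)
class Cap:
    """A guarded process C.P with C one of in, out, open."""
    kind: str
    target: str
    cont: list = field(default_factory=list)


def step(soup):
    """Perform one In, Out, or Open reduction in `soup`; return True if one fired."""
    ambients = [p for p in soup if isinstance(p, Amb)]
    for n in ambients:  # In: m[P] | n[in m.Q | R] -> m[P | n[Q | R]]
        for cap in [c for c in n.body if isinstance(c, Cap) and c.kind == "in"]:
            m = next((a for a in ambients if a is not n and a.name == cap.target), None)
            if m is not None:
                n.body.remove(cap)
                n.body.extend(cap.cont)
                soup.remove(n)
                m.body.append(n)
                return True
    for cap in [c for c in soup if isinstance(c, Cap) and c.kind == "open"]:
        n = next((a for a in ambients if a.name == cap.target), None)
        if n is not None:  # Open: open n.Q | n[R] -> Q | R
            soup.remove(cap)
            soup.remove(n)
            soup.extend(cap.cont + n.body)
            return True
    for m in ambients:  # Out: m[P | n[out m.Q | R]] -> m[P] | n[Q | R]
        for n in [c for c in m.body if isinstance(c, Amb)]:
            for cap in [c for c in n.body if isinstance(c, Cap) and c.kind == "out"]:
                if cap.target == m.name:
                    n.body.remove(cap)
                    n.body.extend(cap.cont)
                    m.body.remove(n)
                    soup.append(n)
                    return True
    # Evaluation contexts: otherwise look for a redex inside an ambient.
    return any(step(a.body) for a in ambients)


def show(soup):
    """Render a parallel composition as text."""
    def one(p):
        if isinstance(p, Amb):
            return f"{p.name}[{show(p.body)}]"
        return f"{p.kind} {p.target}.{show(p.cont) or '0'}"
    return " | ".join(one(p) for p in soup)


# n[in m.out m.0] | m[]
soup = [Amb("n", [Cap("in", "m", [Cap("out", "m")])]), Amb("m")]
trace = [show(soup)]
while step(soup):
    trace.append(show(soup))
```

Each call to `step` here is an atomic reduction on a shared tree. The paper's point is that a distributed implementation cannot perform such steps atomically, which is why each of them is split into several local steps in the translation.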
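\[
\begin{array}{rcll}
P &::=& & \text{join calculus process} \\
  & & 0 & \text{inert process} \\
  &\mid& P \mid P' & \text{parallel composition} \\
  &\mid& x(\tilde{v}) & \text{asynchronous message} \\
  &\mid& go(a); P & \text{migration request} \\
  &\mid& \mathbf{def}\ D\ \mathbf{in}\ P & \text{local definition} \\
D &::=& & \text{join calculus definition} \\
  & & \top & \text{void definition} \\
  &\mid& D \wedge D' & \text{composition} \\
  &\mid& J \triangleright P & \text{reaction rule} \\
  &\mid& a[D : P] & \text{sub-location (named } a \text{, running } D \text{ and } P) \\
  &\mid& \alpha[D : P] & \text{top-level location (with path } \alpha \text{, running } D \text{ and } P) \\
J &::=& & \text{join pattern} \\
  & & x(\tilde{y}) & \text{message pattern} \\
  &\mid& J \mid J' & \text{synchronization}
\end{array}
\]

Figure 7. Syntax for the distributed join calculus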



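Structural equivalence $\equiv$ (on both processes and definitions) is the smallest
equivalence relation closed by application of contexts $\cdot \wedge \cdot$, $\cdot \mid \cdot$ and
$a[\,\cdot : \cdot\,]$, by α-conversion on bound names, and such that:

\[
\begin{array}{llll}
\text{P0} & P \mid 0 \equiv P & \text{D0} & D \wedge \top \equiv D \\[2pt]
\text{P1} & P \mid P' \equiv P' \mid P & \text{D1} & D \wedge D' \equiv D' \wedge D \\[2pt]
\text{P2} & (P \mid P') \mid P'' \equiv P \mid (P' \mid P'') & \text{D2} & (D \wedge D') \wedge D'' \equiv D \wedge (D' \wedge D'')
\end{array}
\]

\[
\begin{array}{ll}
\text{Tree} & \alpha[\, a[D' : P'] \wedge D : P \,] \;\equiv\; \alpha.a[D' : P'] \wedge \alpha[D : P] \\[6pt]
\text{Scope} & \dfrac{\text{names defined in } D' \text{ are fresh}}{\alpha[\, D : P \mid \mathbf{def}\ D'\ \mathbf{in}\ P' \,] \;\equiv\; \alpha[\, D \wedge D' : P \mid P' \,]}
\end{array}
\]

Join calculus reduction $\to$ is the smallest relation on configurations that is closed
by structural equivalence and such that:

\[
\begin{array}{ll}
\text{Comm} & \dfrac{x \text{ is defined in } D'}{\alpha[D : x(\tilde{v}) \mid P] \wedge \beta[D' : P'] \wedge E \;\to\; \alpha[D : P] \wedge \beta[D' : x(\tilde{v}) \mid P'] \wedge E} \\[12pt]
\text{Join} & \dfrac{\sigma \text{ operates on message contents of } J}{\alpha[D \wedge J \triangleright Q : J\sigma \mid P] \wedge E \;\to\; \alpha[D \wedge J \triangleright Q : Q\sigma \mid P] \wedge E} \\[12pt]
\text{Go} & \dfrac{a \text{ folded}}{\alpha.a[D : P \mid go(b); Q] \wedge \beta.b[D' : P'] \wedge E \;\to\; \beta.b.a[D : P \mid Q] \wedge \beta.b[D' : P'] \wedge E}
\end{array}
\]

Figure 8. Operational semantics for the distributed join calculus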
Abstract. In our previous papers [?], we presented advanced type
systems for the π-calculus, which can guarantee deadlock-freedom in
the sense that certain communications will eventually succeed unless
the whole process diverges. Although such a guarantee is quite useful for
reasoning about the behavior of concurrent programs, there still remains
a weakness that the success of a communication cannot be completely
guaranteed due to the problem of divergence. For example, while a server
process that has received a request message cannot discard the request, it
is allowed to infinitely delegate the request to other processes, causing a
livelock. In this paper, we show that we can guarantee not only
deadlock-freedom but also livelock-freedom, by modifying our previous type
systems for deadlock-freedom. The resulting type system guarantees that
certain communications will eventually succeed under fair scheduling, no
matter whether processes diverge. Moreover, it can also guarantee that
some of those communications will succeed within a certain amount of
time.



1 Introduction 

It is an important and challenging task to statically guarantee the correctness
of concurrent programs: Because concurrent programs are more complex than
sequential programs (due to dynamic control, non-determinism, deadlock, etc.),
it is hard for programmers to debug concurrent programs or reason about their
behavior. Recent advanced use of the Internet is further increasing the
importance of correctness of concurrent programs: It is now becoming common that
programs (which may be written by untrusted or malicious programmers) are
distributed through the Internet and they run concurrently at multiple sites,
interacting with each other.

Unfortunately, existing concurrent/distributed programming languages or
thread libraries provide only limited support for the correctness of concurrent
programs. Some of them were proposed as extensions of typed functional
languages [?], but their type systems do not give much information about the
behavior of a concurrent program: for example, in CML [?] a term of a
function type may not even behave like a function: it may get stuck or return
non-deterministic results.

J. van Leeuwen et al. (Eds.): IFIP TCS2000, LNCS 1872, pp. 365-^^^ 2000. 

© Springer-Verlag Berlin Heidelberg 2000
To improve the above situation, a number of type systems have been studied
through process calculi, just as type systems for functional languages have been
studied through λ-calculus. Our type systems [?] for the π-calculus [?]
are among the most powerful ones. They can guarantee that every well-typed
process is deadlock-free, in the sense that certain communications will eventually
succeed unless they diverge. In the most recent type system [?], a programmer
just needs to declare which communications they want to succeed, and if the
whole process is judged to be well-typed, it is indeed guaranteed that those
communications succeed unless the process diverges. For example, in the
server-client model, a client process can be written as
$(\nu r)\,(s![request, r] \mid r?^{c}[reply].\,P)$ in the π-calculus-like language of [?].
$(\nu r)$ creates a fresh communication channel for receiving a reply from the server.
$s![request, r]$ sends a request and the channel to the location s (which is also a
channel) of a server. In parallel to this, $r?^{c}[reply].\,P$ waits for a reply from the
server. The annotation c to ? indicates that a reply should eventually arrive. If the
whole system of processes (including the server) is well typed, then a reply is
indeed guaranteed to arrive (again, unless the whole system diverges): the server
eventually receives the request message, and sends a reply. In this way, the type
system can ensure that a process implementing a server really behaves like a
server and that a channel implementing a semaphore is really used as a semaphore
(a process that has acquired a semaphore will eventually release it). The type
systems are reasonably expressive: In the previous paper [?], we have shown that
our type system can guarantee deadlock-freedom of the simply-typed λ-calculus
with various reduction strategies and typical concurrent objects. Nestmann [?]
used our type system to show that his encoding of the choice operator into a
choice-free fragment of the π-calculus does not introduce deadlock.

Although the above deadlock-freedom property is quite useful for reasoning 
about the behavior of concurrent programs, there is still a limitation: the success 
of a communication is not completely guaranteed because a process may diverge 
before the communication succeeds. For example, while a server process that has 
received a request message cannot discard the request, it is allowed to infinitely 
delegate the request to other processes, causing a livelock. 

In this paper, we show that we can guarantee livelock-freedom as well as 
deadlock-freedom, by modifying our previous type systems for deadlock-freedom. 
The resulting type system guarantees that certain communications will 
eventually succeed under fair scheduling, no matter whether processes diverge. 
Moreover, it can also guarantee that some of those communications will succeed 
within a certain amount of time. Surprisingly, this modification to our previous 
type systems also has a good effect on formalization: Intuitions on types become 
clearer, and the typing rules become even simpler. 

In the rest of this paper, we first introduce our target language and explain 
what we mean by deadlock and livelock in Section 2. Section 3 reviews basic 
ideas of our previous type systems for deadlock-freedom. Section 4 introduces 
our new type system for freedom from livelock and deadlock, and shows its 
soundness. Section 5 shows that with a minor modification, the type system 
can also guarantee that certain communications succeed within a certain time 
bound. The type system in Section 4 is rather naive and cannot guarantee the 
livelock-freedom of recursive programs. In Section 6, we show that we can recover 
the expressive power to some extent by introducing dependent types. Section 7 
discusses related work and Section 8 concludes this paper. Due to lack of space, 
most proofs are omitted: they are found in an accompanying technical report. 
We do not discuss issues of type checking or type reconstruction in this paper. We 
expect that those issues are basically similar to the case for the deadlock-free 
calculus, but complete type reconstruction would be unrealistic for an 
extension with dependent types. 



2 Target Language 

This section introduces the target language of our type system and defines 
deadlock and livelock. The target language is a subset of the polyadic 
π-calculus. We drop the matching and choice operators from the π-calculus, 
but keep the other operators. In particular, channels are first-class citizens as 
in the usual π-calculus, in the sense that they can be dynamically created and 
passed through other channels. Although this makes it difficult to guarantee 
deadlock/livelock-freedom, we believe that first-class channels are important in 
modeling modern concurrent/distributed programming languages. For example, 
in concurrent object-oriented programming, an object is dynamically created 
and its reference is passed through messages. Because a reference 
to a concurrent object corresponds to a record of communication channels, 
channels should be first-class data. 



2.1 Syntax 

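Definition 1 (processes). The set of processes is defined by the following 
syntax. 

\[
\begin{array}{lcl}
P \ \text{(processes)} & ::= & \mathbf{0} \mid x!^{a}[v_1,\ldots,v_n].P \mid x?^{a}[y_1,\ldots,y_n].P \mid (P \mid Q) \mid (\nu x)\,P \\
 & \mid & \mathbf{if}\ v\ \mathbf{then}\ P\ \mathbf{else}\ Q \mid {*P} \\
v \ \text{(values)} & ::= & \mathit{true} \mid \mathit{false} \mid x \\
a \ \text{(attributes)} & ::= & \emptyset \mid c
\end{array}
\]

Here, $x$ and $y_i$ range over a countably infinite set of variables. 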

Notation 2. As usual, $y_1, \ldots, y_n$ in $x?^{a}[y_1, \ldots, y_n].P$ and $x$ in $(\nu x)\,P$ are called 
bound variables. The other variables are called free variables. We assume that 
α-conversions are implicitly applied so that bound variables are always different 
from each other and from free variables. $[x_1 \mapsto v_1, \ldots, x_n \mapsto v_n]P$ denotes a 
process obtained from $P$ by replacing all free occurrences of $x_1, \ldots, x_n$ with 
$v_1, \ldots, v_n$. We write $\tilde{x}$ for a sequence $x_1, \ldots, x_n$. $[\tilde{x} \mapsto \tilde{v}]$ and $(\nu \tilde{x})$ abbreviate 
$[x_1 \mapsto v_1, \ldots, x_n \mapsto v_n]$ and $(\nu x_1) \cdots (\nu x_n)$ respectively. We often omit an 
inaction $\mathbf{0}$ and write $x!^{a}[\tilde{y}]$ for $x!^{a}[\tilde{y}].\mathbf{0}$. When attributes are not important, 
we omit them and just write $x![\tilde{y}].P$ and $x?[\tilde{y}].P$ for $x!^{a}[\tilde{y}].P$ and $x?^{a}[\tilde{y}].P$ 
respectively. 

$\mathbf{0}$ denotes inaction. $x!^{a}[v_1, \ldots, v_n].P$ denotes a process that sends a tuple 
$[v_1, \ldots, v_n]$ on $x$ and then (after the tuple is received by some process) behaves 
like $P$. An attribute $a$ expresses the programmer's intention and it does not 
affect the operational semantics: $a = c$ means that the programmer wants this 
output to succeed, i.e., once the output is executed and the tuple is sent, the 
tuple is expected to be received eventually. There is no such requirement when 
$a$ is $\emptyset$. $x?^{a}[y_1, \ldots, y_n].P$ denotes a process that receives a tuple $[v_1, \ldots, v_n]$ on 
$x$ and then behaves like $[y_1 \mapsto v_1, \ldots, y_n \mapsto v_n]P$. If $a$ is $c$, then this input is 
expected by the programmer to succeed eventually. $P \mid Q$ denotes a concurrent 
execution of $P$ and $Q$, and $(\nu x)\,P$ denotes a process that creates a fresh channel 
$x$ and then behaves like $P$. $\mathbf{if}\ v\ \mathbf{then}\ P\ \mathbf{else}\ Q$ behaves like $P$ if $v$ is true 
and behaves like $Q$ if $v$ is false; otherwise it is blocked forever. $*P$ represents 
infinitely many copies of the process $P$ running in parallel. 
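
As an informal illustration of the syntax and of the bound/free variable convention, the following Python sketch encodes processes as a small abstract syntax tree and computes free variables. The class and function names (`Nil`, `Out`, `In`, `Par`, `New`, `If`, `Rep`, `free_vars`) are our own hypothetical choices and do not come from the paper; the attribute is encoded as the string `"c"` or the empty string.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Nil:                      # 0 (inaction)
    pass

@dataclass(frozen=True)
class Out:                      # x!^a[v1,...,vn].P
    chan: str
    args: Tuple[str, ...]
    cont: "Proc"
    attr: str = ""              # "" encodes the empty attribute, "c" encodes c

@dataclass(frozen=True)
class In:                       # x?^a[y1,...,yn].P
    chan: str
    params: Tuple[str, ...]
    cont: "Proc"
    attr: str = ""

@dataclass(frozen=True)
class Par:                      # P | Q
    left: "Proc"
    right: "Proc"

@dataclass(frozen=True)
class New:                      # (nu x) P
    name: str
    body: "Proc"

@dataclass(frozen=True)
class If:                       # if v then P else Q
    cond: str
    then: "Proc"
    orelse: "Proc"

@dataclass(frozen=True)
class Rep:                      # *P
    body: "Proc"

Proc = Union[Nil, Out, In, Par, New, If, Rep]

def _names(values) -> frozenset:
    """Variables among a list of values (the constants true/false are skipped)."""
    return frozenset(v for v in values if v not in ("true", "false"))

def free_vars(p: Proc) -> frozenset:
    """Free variables of p; input parameters and (nu x) bind their names."""
    if isinstance(p, Nil):
        return frozenset()
    if isinstance(p, Out):
        return frozenset({p.chan}) | _names(p.args) | free_vars(p.cont)
    if isinstance(p, In):
        return frozenset({p.chan}) | (free_vars(p.cont) - set(p.params))
    if isinstance(p, Par):
        return free_vars(p.left) | free_vars(p.right)
    if isinstance(p, New):
        return free_vars(p.body) - {p.name}
    if isinstance(p, If):
        return _names([p.cond]) | free_vars(p.then) | free_vars(p.orelse)
    return free_vars(p.body)    # Rep

# A client in the style of the introduction: (nu r)(s![request, r] | r?^c[reply].0)
client = New("r", Par(Out("s", ("request", "r"), Nil()),
                      In("r", ("reply",), Nil(), attr="c")))
```

Here `free_vars(client)` contains only `s` and `request`, since `r` is bound by the channel creation and `reply` by the input prefix.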

Example 3. A process $*\mathit{succ}?[m, n, r].\,r![m + n]$ behaves as a function server 
computing the sum of two integers. (For clarity, we extend the language with 
integers and operations on them here.) It receives a triple consisting of two 
integers and a channel, and sends the sum of the integers to the channel. A 
client process can be written like $(\nu y)\,(\mathit{succ}![1, 2, y] \mid y?^{c}[x].\,P)$. The attribute $c$ 
of the input process declares that the result should be eventually received. 

Example 4. A binary semaphore (or lock) can be implemented by using a 
channel. Basically, we can regard the presence of a value in the channel as the unlocked 
state, and the absence of a value as the locked state. Then, creation of a 
semaphore corresponds to channel creation, followed by output of a null tuple 
($(\nu x)\,(x![\,] \mid P)$). The semaphore can be acquired by extracting a value from the 
channel ($x?[\,].\,Q$), and released by putting a value back into the channel ($x![\,]$). 
If we want to make sure that the semaphore can be eventually acquired, we can 
annotate the input as $x?^{c}[\,].\,Q$. 

2.2 Operational Semantics 

The operational semantics is essentially the same as the standard reduction 
semantics of the π-calculus. For a subtle technical reason, we introduce a 
structural preorder $\preceq$ instead of a structural congruence relation. The only 
differences from the usual structural congruence $\equiv$ are that $*P \mid P \preceq *P$ does not 
hold and that $\preceq$ is not closed under output and input prefixes and conditionals. 

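Definition 5. The structural preorder $\preceq$ is the least reflexive and transitive 
relation closed under the following rules ($P \equiv Q$ denotes $(P \preceq Q) \land (Q \preceq P)$): 

\[
\begin{array}{c}
P \mid \mathbf{0} \equiv P \qquad P \mid Q \equiv Q \mid P \qquad P \mid (Q \mid R) \equiv (P \mid Q) \mid R \\[1ex]
(\nu x)\,(P \mid Q) \equiv (\nu x)\,P \mid Q \ \ (x \text{ not free in } Q) \qquad {*P} \preceq {*P} \mid P \\[1ex]
\dfrac{P \preceq P' \quad Q \preceq Q'}{P \mid Q \preceq P' \mid Q'} \qquad
\dfrac{P \preceq Q}{(\nu x)\,P \preceq (\nu x)\,Q} \qquad
\dfrac{P \preceq Q}{{*P} \preceq {*Q}}
\end{array}
\]
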


Now we define the reduction relation. Following the operational semantics of 
the linear π-calculus, we define the reduction relation as a ternary relation 
$P \stackrel{l}{\longrightarrow} Q$. $l$ expresses on which channel the reduction is performed: $l$ is either $\epsilon$, 
which means that the reduction is performed by communication on an internal 
channel or by the reduction of a conditional expression, or $x$, which means that 
the reduction is performed by communication on the free channel $x$. 

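Definition 6. The reduction relation $\stackrel{l}{\longrightarrow}$ is the least relation closed under the 
following rules: 

\[
\begin{array}{c}
x!^{a}[\tilde{v}].P \mid x?^{a'}[\tilde{z}].Q \stackrel{x}{\longrightarrow} P \mid [\tilde{z} \mapsto \tilde{v}]Q \\[1ex]
\dfrac{P \stackrel{l}{\longrightarrow} Q}{P \mid R \stackrel{l}{\longrightarrow} Q \mid R} \qquad
\dfrac{P \stackrel{x}{\longrightarrow} Q}{(\nu x)\,P \stackrel{\epsilon}{\longrightarrow} (\nu x)\,Q} \qquad
\dfrac{P \stackrel{l}{\longrightarrow} Q \quad l \neq x}{(\nu x)\,P \stackrel{l}{\longrightarrow} (\nu x)\,Q} \\[1ex]
\dfrac{P \preceq P' \quad P' \stackrel{l}{\longrightarrow} Q' \quad Q' \preceq Q}{P \stackrel{l}{\longrightarrow} Q} \\[1ex]
\mathbf{if}\ \mathit{true}\ \mathbf{then}\ P\ \mathbf{else}\ Q \stackrel{\epsilon}{\longrightarrow} P \qquad
\mathbf{if}\ \mathit{false}\ \mathbf{then}\ P\ \mathbf{else}\ Q \stackrel{\epsilon}{\longrightarrow} Q
\end{array}
\]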

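Notation 7. We write $P \longrightarrow Q$ if $P \stackrel{l}{\longrightarrow} Q$ for some $l$. We write $\longrightarrow^{*}$ for the 
reflexive and transitive closure of $\longrightarrow$. $P \stackrel{l}{\longrightarrow}$ and $P \longrightarrow$ mean $\exists Q.(P \stackrel{l}{\longrightarrow} Q)$ 
and $\exists Q.(P \longrightarrow Q)$ respectively. 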

2.3 Deadlock-Freedom and Livelock-Freedom 

Based on the above operational semantics, we formally define deadlock-freedom 
and livelock-freedom below. Basically, we regard a process as deadlocked or li- 
velocked if one of its subprocesses is trying to communicate with some pro- 
cess but blocked forever without finding a communication partner. However, not 
every process that is blocked forever should be regarded as being in a bad state. 
For example, it should be no problem that a server process waits for a request fo- 
rever: it just means that no client process sends a request message. It is also fine 
that an output process remains forever on a channel implementing a semaphore 
(Example 4), because it means that no process tries to acquire the semaphore. 
Therefore, we focus our attention on communications annotated with the attri- 
bute c. A process is considered deadlocked or livelocked if it is trying to perform 
input or output but blocked forever, and if the input or output is annotated with 
c. 



We first define deadlock. 

Definition 8 (deadlock, deadlock-freedom). A process $P$ is in deadlock if 
(i) $P \preceq (\nu \tilde{y})\,(x?^{c}[\tilde{z}].Q \mid R)$ or $P \preceq (\nu \tilde{y})\,(x!^{c}[\tilde{z}].Q \mid R)$ and (ii) there is no $P'$ 
such that $P \longrightarrow P'$. A process $P$ is deadlock-free if there exists no $Q$ such that 
$P \longrightarrow^{*} Q$ and $Q$ is in deadlock. 

Remark 9. In the usual terminology, deadlock often refers to a more restricted 
state, where processes are blocked forever because of cyclic dependencies 
on communications. As the above definition shows, in this paper deadlock 
refers to a state where processes are blocked forever, irrespective of whether 
or not there are cyclic dependencies. Our definition of deadlock subsumes the 
usual, narrower definition of deadlock. 



Example 10. $(\nu x)\,(x?^{c}[\,].\mathbf{0})$ is in deadlock because the input from $x$ is 
annotated with $c$ but there is no output process. $(\nu x)\,(\nu y)\,(x?^{c}[\,].\,y![\,] \mid y?[\,].\,x![\,])$ is 
also deadlocked because the input on $x$ cannot succeed because of cyclic 
dependencies on communications on $x$ and $y$. On the other hand, $(\nu x)\,(x?[\,].\mathbf{0})$ and 
$(\nu x)\,(x![\,] \mid x?^{c}[\,].\mathbf{0})$ are not in deadlock. 
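
The "in deadlock" condition for small processes like these can be checked mechanically. The following Python sketch is only an approximation under stated assumptions: processes are encoded as nested tuples (our own hypothetical encoding), message payloads and conditionals are omitted, and bound names are assumed to be pairwise distinct (as in the α-conversion convention), so that channel creation can simply be stripped when looking for exposed prefixes.

```python
# Hypothetical tuple encoding (not from the paper):
#   ("nil",)               0
#   ("out", x, attr, P)    x!^attr[].P     attr is "c" or ""
#   ("in",  x, attr, P)    x?^attr[].P
#   ("par", P, Q)          P | Q
#   ("new", x, P)          (nu x) P
#   ("rep", P)             *P
NIL = ("nil",)

def top_level(p):
    """Prefixes exposed after flattening |, (nu x) and one unfolding of *P."""
    tag = p[0]
    if tag == "nil":
        return []
    if tag in ("out", "in"):
        return [p]
    if tag == "par":
        return top_level(p[1]) + top_level(p[2])
    if tag == "new":
        return top_level(p[2])
    if tag == "rep":
        return top_level(p[1])
    raise ValueError(f"unknown process form: {tag}")

def can_reduce(p):
    """Some exposed output and some exposed input use the same channel."""
    prefixes = top_level(p)
    outs = {q[1] for q in prefixes if q[0] == "out"}
    ins = {q[1] for q in prefixes if q[0] == "in"}
    return bool(outs & ins)

def in_deadlock(p):
    """An exposed prefix is annotated with c, and no reduction is possible."""
    waiting = any(q[2] == "c" for q in top_level(p))
    return waiting and not can_reduce(p)

# The four processes discussed in Example 10.
ex_a = ("new", "x", ("in", "x", "c", NIL))
ex_b = ("new", "x", ("new", "y",
        ("par", ("in", "x", "c", ("out", "y", "", NIL)),
                ("in", "y", "", ("out", "x", "", NIL)))))
ex_c = ("new", "x", ("in", "x", "", NIL))
ex_d = ("new", "x", ("par", ("out", "x", "", NIL), ("in", "x", "c", NIL)))
```

With this encoding, `in_deadlock` holds for `ex_a` and `ex_b` and fails for `ex_c` and `ex_d`. Note that this only tests whether a given process is in deadlock; deadlock-freedom additionally quantifies over all reachable processes.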

Example 11. A process $(\nu x)\,(x?^{c}[\,].\mathbf{0} \mid (\nu y)\,(y![x] \mid {*y?[z].\,y![z]}))$ is deadlock-free. 
Although the input from $x$ never succeeds, the entire process is never blocked. 

The last example shows a weakness of the deadlock-freedom property: Even if 
a process is deadlock-free, it is not completely guaranteed that communications 
eventually succeed. The livelock-freedom property defined below requires that 
communications eventually succeed, no matter whether the process diverges. 

To give a reasonable definition of livelock, we need to assume that scheduling 
is fair, in the sense that every communication that is enabled infinitely many 
times eventually succeeds. 

Definition 12 (fair reduction sequence). A reduction sequence 
$P_0 \longrightarrow P_1 \longrightarrow P_2 \longrightarrow \cdots$ is fair if the following conditions hold. 

(i) If there exists an infinite increasing sequence $n_1 < n_2 < \cdots$ of 
natural numbers such that 
$P_{n_i} \preceq (\nu \tilde{w}_i)\,(x!^{a}[\tilde{v}].Q \mid x?^{a'}[\tilde{y}].Q_i \mid R_i)$, then 
there exists $n \geq n_1$ such that 
$P_n \preceq (\nu \tilde{w})\,(x!^{a}[\tilde{v}].Q \mid x?^{a'}[\tilde{y}].Q' \mid R')$ and 
$(\nu \tilde{w})\,(Q \mid [\tilde{y} \mapsto \tilde{v}]Q' \mid R') \preceq P_{n+1}$. 

(ii) If there exists an infinite increasing sequence $n_1 < n_2 < \cdots$ of 
natural numbers such that 
$P_{n_i} \preceq (\nu \tilde{w}_i)\,(x?^{a}[\tilde{y}].Q \mid x!^{a'}[\tilde{v}_i].Q_i \mid R_i)$, then 
there exists $n \geq n_1$ such that 
$P_n \preceq (\nu \tilde{w})\,(x?^{a}[\tilde{y}].Q \mid x!^{a'}[\tilde{v}].Q' \mid R')$ and 
$(\nu \tilde{w})\,([\tilde{y} \mapsto \tilde{v}]Q \mid Q' \mid R') \preceq P_{n+1}$. 

(iii) If $P_i \preceq (\nu \tilde{w})\,(\mathbf{if}\ v\ \mathbf{then}\ Q_1\ \mathbf{else}\ Q_2 \mid R)$ and $v$ = true or false for some $i$, 
then either the reduction sequence is finite or there exists $n \geq i$ such that 
$P_n \preceq (\nu \tilde{w})\,(\mathbf{if}\ v\ \mathbf{then}\ Q_1\ \mathbf{else}\ Q_2 \mid R')$, $(\nu \tilde{w})\,(Q' \mid R') \preceq P_{n+1}$, and $Q' = Q_1$ 
if $v$ = true and $Q' = Q_2$ otherwise. 

Note that by definition, every finite reduction sequence is fair. 

Definition 13 (full reduction sequence). A reduction sequence $P_1 \longrightarrow P_2 
\longrightarrow \cdots$ is full if it is an infinite reduction sequence or a finite reduction sequence 
ending with $P_n$ such that $P_n \not\longrightarrow$. 

Now we define the livelock-freedom property. Intuitively, a process is livelock- 
free if in any fair reduction sequence, a process trying to perform communication 
with capability annotation can eventually communicate with another process. 

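Definition 14 (livelock-freedom). A process $P_0$ is livelock-free if the 
following conditions hold for any full fair reduction sequence 
$P_0 \longrightarrow P_1 \longrightarrow P_2 \longrightarrow \cdots$: 

(i) If $P_i \preceq (\nu \tilde{w})\,(x!^{c}[\tilde{v}].Q \mid R)$ for some $i \geq 0$, there exists $n \geq i$ such that 
$P_n \preceq (\nu \tilde{w}')\,(x!^{c}[\tilde{v}].Q \mid x?^{a}[\tilde{y}].R_1 \mid R_2)$ and $(\nu \tilde{w}')\,(Q \mid [\tilde{y} \mapsto \tilde{v}]R_1 \mid R_2) \preceq P_{n+1}$. 

(ii) If $P_i \preceq (\nu \tilde{w})\,(x?^{c}[\tilde{y}].Q \mid R)$ for some $i \geq 0$, there exists $n \geq i$ such that 
$P_n \preceq (\nu \tilde{w}')\,(x?^{c}[\tilde{y}].Q \mid x!^{a}[\tilde{v}].R_1 \mid R_2)$ and $(\nu \tilde{w}')\,([\tilde{y} \mapsto \tilde{v}]Q \mid R_1 \mid R_2) \preceq P_{n+1}$. 

Again, our definition of livelock may differ from the usual terminology. In our 
definition, livelock-freedom is a strictly stronger property than deadlock-freedom. 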



Example 15. The process in Example 11 is deadlock-free but not livelock-free. 
A process $(\nu x)\,(x![\,] \mid x?^{c}[\,].\mathbf{0}) \mid (\nu y)\,(y![\,] \mid {*y?[\,].\,y![\,]})$ is livelock-free: Under the 
fairness assumption, the input from $x$ eventually succeeds. 



3 Previous Type Systems for Deadlock-Freedom 



The basic idea of our previous type systems for deadlock-freedom is to 
extend ordinary channel types [5, 24, 17] with more precise information on how 
channels are used. The extra information can be classified into the channel-wise 
behavior, expressed by using usages, and the inter-channel dependency on 
the order of communications, expressed by using time tags. We first explain 
usages and time tags, and then sketch the type system. 



3.1 Usages 

A usage of a channel describes in which order the channel can be used for input 
and/or output. For example, we denote by $\mathbf{0}$ the usage of a channel that cannot 
be used at all, and by $I.O.\mathbf{0}$ the usage of a channel that can be first used for input 
and then used for output. $U_1 \parallel U_2$ denotes the usage of a channel that can be 
used according to $U_1$ by one process and $U_2$ by another process. $*U$ denotes the 
usage of a channel that can be used according to $U$ by infinitely many processes. 
The usage of an input-only channel is expressed as $*I.\mathbf{0}$, and that of a linear 
(use-once) channel is expressed as $I.\mathbf{0} \parallel O.\mathbf{0}$. So, usages are a generalization 
of polarities and multiplicities of the linear π-calculus. A channel used as a 
binary semaphore (Example 4) has the usage $O.\mathbf{0} \parallel {*I.O.\mathbf{0}}$. 

Capabilities and Obligations. Because we are concerned with whether each input 
or output succeeds, we associate each $I$ and $O$ with a set of two 
complementary attributes called capabilities (denoted by $c$) and obligations (denoted by 
$o$). The capability attribute indicates that the communication is guaranteed to 
succeed. A guaranteed input or output operation ($?^{c}$ or $!^{c}$ in Section 2) is 
allowed only when $I$ or $O$ is annotated with the capability attribute. We can refine 
the usage of a binary semaphore given above as $O.\mathbf{0} \parallel {*I_c.O.\mathbf{0}}$, to indicate that 
the semaphore can be eventually acquired. In order for the semaphore to be 
always acquired, however, the output denoted by $O$ must be executed. It is 
expressed by the obligation attribute. So, the correct usage of a binary 
semaphore is $O_o.\mathbf{0} \parallel {*I_c.O_o.\mathbf{0}}$. Similarly, the usage of a linear channel is refined as 
$O_{co}.\mathbf{0} \parallel I_{co}.\mathbf{0}$, meaning that the channel must be used exactly once for input and 
output, and that the communication always succeeds. The usage of a channel 
used for client-server connection can be denoted by ${*I_o.\mathbf{0}} \parallel {*O_c.\mathbf{0}}$. $*I_o.\mathbf{0}$ means 
that a server must accept infinitely many request messages, and $*O_c.\mathbf{0}$ means 
that clients can successfully send infinitely many request messages. 

Channel-Wise Consistency. Creating a channel of some bad usage results in 
deadlock. For example, creating a channel of usage $I_c.\mathbf{0} \parallel O.\mathbf{0}$ may cause deadlock, 
because an input must succeed, but an output operation may not be executed. 
So, we require that the usage of each channel must be consistent (called reliable) 
in the sense that each input/output capability is guaranteed by the 
corresponding output/input obligation. This condition excludes a deadlocked process 
$(\nu x)\,x!^{c}[\,]$, which uses $x$ according to the inconsistent usage $O_c.\mathbf{0}$. 

3.2 Time Tags 

Controlling the channel-wise behavior is not sufficient to guarantee 
deadlock-freedom. For example, consider the process $x?^{c}[\,].\,y![\,] \mid y?^{c}[\,].\,x![\,]$. It uses 
channels $x$ and $y$ according to a consistent usage $I_c.\mathbf{0} \parallel O_o.\mathbf{0}$ but it is in deadlock. This 
is because of the following cyclic dependencies between $x$ and $y$: The condition 
that the input from $x$ is a capability depends on the condition that the output 
on $x$ is executed (i.e., is an obligation), which depends on the condition that 
the input from $y$ succeeds (i.e., is a capability). But it further depends on the 
output obligation on $y$, which depends on the input capability on $x$. 

To avoid such cyclic dependencies, we associate each channel with a time 
tag and maintain the order of time tags. Let $t_x$ be the time tag of a channel 
$x$ and $t_y$ be that of $y$. Then, we allow a process to use capabilities on $x$ before 
fulfilling obligations on $y$ only if there is an ordering $t_x < t_y$. In the above 
example, $x?^{c}[\,].\,y![\,]$ is allowed under the ordering $t_x < t_y$ but the other 
sub-process $y?^{c}[\,].\,x![\,]$ is disallowed because it tries to use an input capability on $y$ 
before fulfilling an output obligation on $x$. 
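
The role of the ordering can be illustrated in isolation: a set of required orderings between time tags can be satisfied by some strict partial order exactly when the orderings contain no cycle. The following Python sketch checks this with a depth-first search. It is only an illustration of the acyclicity idea, not the type system itself; the function name `consistent` and the string names of the time tags are hypothetical.

```python
def consistent(orderings):
    """True if the pairs (a, b), read 'a < b', fit into one strict partial order."""
    succ = {}
    for a, b in orderings:
        succ.setdefault(a, set()).add(b)
        succ.setdefault(b, set())

    WHITE, GREY, BLACK = 0, 1, 2          # unvisited / on the DFS stack / finished
    colour = {n: WHITE for n in succ}

    def has_cycle_from(n):
        colour[n] = GREY
        for m in succ[n]:
            if colour[m] == GREY:          # back edge: a cycle of required orderings
                return True
            if colour[m] == WHITE and has_cycle_from(m):
                return True
        colour[n] = BLACK
        return False

    return not any(colour[n] == WHITE and has_cycle_from(n) for n in succ)

# x?^c[].y![] needs t_x < t_y; y?^c[].x![] needs t_y < t_x.
left_only = [("t_x", "t_y")]
both = [("t_x", "t_y"), ("t_y", "t_x")]
```

Under this reading, `consistent(left_only)` holds, while `consistent(both)` fails, mirroring the rejection of the cyclic process above.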

3.3 Typing 

By using usages and time tags, we can express the type of a channel in the form 
$[\tau_1, \ldots, \tau_n]^{t}/U$. The part $[\tau_1, \ldots, \tau_n]$ means that the channel is used for passing 
a tuple of values of types $\tau_1, \ldots, \tau_n$ ($\tau_1, \ldots, \tau_n$ may also be channel types). $U$ is 
the usage of the channel, and $t$ is the time tag. 

A type judgment is of the form $x_1 : \tau_1, \ldots, x_n : \tau_n; \mathcal{R} \vdash P$, where $\mathcal{R}$ is a 
strict partial order on time tags. It means that $P$ uses $x_1, \ldots, x_n$ according to 
types $\tau_1, \ldots, \tau_n$, and obeys the order $\mathcal{R}$ in performing communications. We give 
examples of type judgments. 

- $x : [\,]^{t_x}/{*I_c.O_o.\mathbf{0}},\ y : [\,]^{t_y}/{*I_c.O_o.\mathbf{0}};\ \{(t_x, t_y)\} \vdash P$: The usage of $x$ and 
$y$ means that $P$ uses $x$ and $y$ as binary semaphores. $\{(t_x, t_y)\}$ indicates that 
$P$ acquires $y$ first when it acquires both semaphores at the same time: If it 
acquired $x$ (i.e., used the input capability on $x$) first, it would obtain the 
obligation to release $x$ (i.e., the output obligation on $x$), so it could not use 
the capability to acquire the semaphore $y$ before fulfilling the obligation to 
release $x$. 

- $f : [[\mathit{int}]^{t'}/O_o.\mathbf{0}]^{t}/{*I_o.\mathbf{0}};\ \emptyset \vdash P$: The usage of $f$ means that $P$ tries to receive 
infinitely many messages from $f$. The type $[\mathit{int}]^{t'}/O_o.\mathbf{0}$ means that a received 
value is a channel for which an integer must be sent. So, the above judgment 
indicates that $P$ behaves like a server providing an integer: $P$ waits to receive 
a channel repeatedly, and each time it receives a channel, it sends an integer 
back to the received channel. 

The following typing rule for input best illustrates how usages and time tags 
are enforced by our type system: 

\[
\dfrac{
\begin{array}{c}
\Gamma, x : [\tau_1, \ldots, \tau_n]^{t_x}/U, y_1 : \tau_1, \ldots, y_n : \tau_n; \mathcal{R} \vdash P \\
(\mathit{hasob}(\Gamma) \lor a = c) \Rightarrow (b = c \lor b = co) \qquad t_x \mathcal{R} \Gamma
\end{array}
}{
\Gamma, x : [\tau_1, \ldots, \tau_n]^{t_x}/I_b.U; \mathcal{R} \vdash x?^{a}[y_1, \ldots, y_n].P
}
\]

Because the first condition of the premise implies that $P$ uses $x$ according to $U$ 
and because $x?^{a}[y_1, \ldots, y_n].P$ uses $x$ for input before that, the total usage of $x$ is 
of the form $I_b.U$. The condition $\mathit{hasob}(\Gamma)$ means that $\Gamma$ contains an obligation 
on some channel, i.e., that an input or output operation must be performed 
on some channel. If so, the input from $x$ must succeed. The attribute $b$ should 
therefore contain the capability attribute. The last condition $t_x \mathcal{R} \Gamma$ means that 
if $\Gamma$ contains an obligation on a channel, $t_x$ should be less than the time tag of 
that channel. 

The following rule for channel creation makes sure that only channels of 
consistent usages are created. 

\[
\dfrac{\Gamma, x : [\tau_1, \ldots, \tau_n]^{t}/U; \mathcal{R} \vdash P \qquad \mathit{reliable}(U)}{\Gamma; \mathcal{R} \vdash (\nu x)\,P}
\]

In this manner, it is enforced that each channel is consistently used and that 
there is no cyclic dependency on communications through different channels. 
Thus, every closed well-typed process is guaranteed to be deadlock-free (formal 
proofs are given in the cited work). 



4 Type System for Livelock-Freedom 

This section introduces our new type system that can guarantee the 
livelock-freedom property. We first analyze weaknesses of our previous type systems and 
explain how to remove them (in Section 4.1). Then we define our new type system 
and show that it is sound in the sense that every closed well-typed process is 
livelock-free. 



4.1 Basic Ideas 

As mentioned earlier, a weakness of the type systems for deadlock-freedom 
is that they cannot completely guarantee the success of 
a communication because of the problem of divergence. In fact, the process 
$(\nu x)\,(x?^{c}[\,].\mathbf{0} \mid (\nu y)\,(y![x] \mid {*y?[z].\,y![z]}))$ in Example 11 is well typed: 
We can assign to $x$ a type $[\,]^{t_x}/(I_c.\mathbf{0} \parallel O_o.\mathbf{0})$ and to $y$ a type 
$[[\,]^{t_x}/O_o.\mathbf{0}]^{t_y}/(O_c.\mathbf{0} \parallel \cdots)$. The type of $y$ demands that any process that 
has received a channel from $y$ must send a value to it, or delegate an obligation 
to do so to other processes (by forwarding the received channel). The above 
process $*y?[z].\,y![z]$ certainly obeys this requirement: it forwards the received 
channel $z$ to $y$, and according to the type of $y$, a receiver of $z$ is supposed to 
fulfill the obligation to send a value to $z$. However, the receiver is actually this 
process itself and therefore the obligation is never fulfilled. 

The above problem comes from the fact that a received obligation $x$ of type 
$[\,]^{t_x}/O_o.\mathbf{0}$ can be just forwarded as the same obligation of type $[\,]^{t_x}/O_o.\mathbf{0}$. 
Actually, however, because it takes some time to forward $x$, it should forward $x$ as 
the obligation that should be fulfilled within a shorter time limit. This 
observation leads us to replace the binary information on whether or not an obligation 
exists with a time limit within which an obligation must be fulfilled. Absence of 
an obligation is the special case where the time limit is infinite. Similarly, the 
capability attribute is also replaced by the maximum amount of time required 
before the communication succeeds. Thus, a usage $O_o.\mathbf{0}$ is replaced by $O^{t_o}_{t_c}.\mathbf{0}$, 
which means that an output operation must be executed within time $t_o$ (by an 
output or input operation being executed, we mean that a process becomes ready 
to output or input a value, not that a process succeeds in finding the 
communication partner and completing the communication), and if the output is executed, it 
is guaranteed to succeed within time $t_c$. 

Now, let us reconsider the process $(\nu x)\,(x?^{c}[\,].\mathbf{0} \mid (\nu y)\,(y![x] \mid {*y?[z].\,y![z]}))$. 
Suppose that the channel $y$ carries a channel for which an output must be 
executed within time $t_o$. Then, when the process $*y?[z].\,y![z]$ receives $x$, some time 
has already been spent for the communication on $y$. So, the process must fulfill 
the obligation to output on $x$ within the time $t_o$ minus the time spent for the 
communication on $y$. It therefore cannot forward $x$ through $y$, which may require 
another time of length $t_o$ until the output is performed. In this manner, infinite 
delegation of an obligation is forbidden, and livelock-freedom is guaranteed as 
a result. 

The above change also has a good effect on formalization. An unpleasant 
point about the previous type systems was that the meaning of a channel type is 
not completely clear by itself because of time tags: The meaning of the time tags 
in a channel type is only determined by the order relative to time tags attached 
to other channel types. Now, the time tags are no longer required (actually 
they have been integrated into time bounds of capabilities and obligations). 
For example, the dependency between $x$ and $y$ in $x?[\,].\,y![\,]$ is captured by the 
condition "The input from $x$ should succeed within a time limit shorter than the 
time limit of the fulfillment of an obligation to output on $y$." This condition can 
be expressed in the typing rules more neatly than the corresponding condition 
on time tags in the previous type systems. 

4.2 Time Quantum 

Now we define our type system. We first introduce time quantums, which are used 
for giving time bounds of capabilities and obligations. We could define a time 
quantum simply as a natural number, expressing the number of reduction steps. 
To keep the flexibility, however, we use the following slightly more complicated 
definition. 

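Definition 16 (time quantum). The set of time quantums, ranged over by $t$, 
is defined by: 

\[
t ::= 0 \mid \mathbf{t} \mid t_1 + t_2 \mid n \cdot t
\]

Here, $\mathbf{t}$ ranges over the carrier set $\mathcal{T}$ of a well-founded order $T = (\mathcal{T}, \ll)$. An 
element of $\mathcal{T}$ is called a time unit. We assume that $\mathcal{T}$ contains the least time 
unit $\mathbf{t}_{\min}$ and the greatest time unit $\mathbf{t}_{\infty}$. $n$ ranges over the set Nat of natural 
numbers. 

Intuitively, $\mathbf{t}_1 \ll \mathbf{t}_2$ means that $\mathbf{t}_1$ is sufficiently shorter than $\mathbf{t}_2$ so that the 
summation of any finite number of $\mathbf{t}_1$s is shorter than $\mathbf{t}_2$. $\mathbf{t}_{\min}$ represents the time 
required for one step reduction of a process. $\mathbf{t}_{\infty}$ represents an infinite length of 
time. These intuitions are expressed in the following definitions of the semantics 
of a time quantum and the order between time quantums. 

Definition 17 (semantics of time quantum). Let $t$ be a time quantum. A 
mapping $[\![t]\!]$ from $\mathcal{T}$ to Nat is defined by: 

\[
\begin{array}{ll}
[\![0]\!](\mathbf{t}) = 0 &
[\![\mathbf{t}']\!](\mathbf{t}) = \begin{cases} 1 & \text{if } \mathbf{t} = \mathbf{t}' \\ 0 & \text{otherwise} \end{cases} \\[2ex]
[\![t_1 + t_2]\!](\mathbf{t}) = [\![t_1]\!](\mathbf{t}) + [\![t_2]\!](\mathbf{t}) &
[\![n \cdot t]\!](\mathbf{t}) = n \times [\![t]\!](\mathbf{t})
\end{array}
\]

Remark 18. We do not distinguish between time quantums whose semantics are 
the same. For example, we identify $(\mathbf{t}_1 + \mathbf{t}_2) + (\mathbf{t}_2 + \mathbf{t}_3)$ with $\mathbf{t}_1 + 2 \cdot \mathbf{t}_2 + \mathbf{t}_3$. With 
this identification, every time quantum can be expressed in a normal form. 

Definition 19. The binary relation $\leq$ on time quantums is defined by: $t_1 \leq t_2$ if 
and only if either (i) $[\![t_2]\!](\mathbf{t}_{\infty}) > 0$, or (ii) for each $\mathbf{t} \in \mathcal{T}$, either $[\![t_1]\!](\mathbf{t}) \leq [\![t_2]\!](\mathbf{t})$ 
or there exists $\mathbf{t}'$ such that $[\![t_1]\!](\mathbf{t}') < [\![t_2]\!](\mathbf{t}')$ and $\mathbf{t} \ll \mathbf{t}'$. We write $t_1 < t_2$ if 
$(t_1 \leq t_2) \land \lnot(t_2 \leq t_1)$. 

It is trivial that the binary relation $\leq$ on time quantums is a preorder. 

Example 20. Suppose $\mathbf{t}_1 \ll \mathbf{t}_2 \ll \mathbf{t}_{\infty}$. Then, $n \cdot \mathbf{t}_1 < \mathbf{t}_2$ for any $n \in$ Nat. 
$\mathbf{t}_{\infty} + t_1 \leq \mathbf{t}_{\infty} + t_2$ holds for any time quantums $t_1$ and $t_2$. So, $\mathbf{t}_{\infty} + t$ is essentially 
equivalent to $\mathbf{t}_{\infty}$. 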
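
The semantics and the order on time quantums can be prototyped directly. The following Python sketch represents a time quantum by its semantics, i.e., a multiset of time units (`collections.Counter`), and implements the two clauses of the order. It assumes, for simplicity, that the well-founded order on time units is given by a numeric ranking (a total order, which is a special case); the names `unit`, `scale`, `make_leq`, `lt`, and the unit names are hypothetical.

```python
from collections import Counter

T_MIN, T_INF = "t_min", "t_inf"     # least and greatest time units

def unit(name):
    """The time quantum consisting of a single time unit."""
    return Counter({name: 1})

def scale(n, t):
    """The time quantum n . t (sums are formed with Counter addition, t1 + t2)."""
    return Counter({u: n * k for u, k in t.items()})

def make_leq(rank):
    """Build <= on time quantums; rank[a] < rank[b] plays the role of 'a much shorter than b'."""
    def much_shorter(a, b):
        return rank[a] < rank[b]

    def leq(t1, t2):
        if t2[T_INF] > 0:                       # clause (i): t2 is infinite
            return True
        units = set(t1) | set(t2)               # other units count 0 on both sides
        for u in units:                         # clause (ii)
            if t1[u] <= t2[u]:
                continue
            if not any(t1[v] < t2[v] and much_shorter(u, v) for v in units):
                return False
        return True

    return leq

def lt(leq, t1, t2):
    """Strict order: t1 <= t2 but not t2 <= t1."""
    return leq(t1, t2) and not leq(t2, t1)

# t_min << t1 << t2 << t_inf
leq = make_leq({T_MIN: 0, "t1": 1, "t2": 2, T_INF: 3})
t1, t2, inf = unit("t1"), unit("t2"), unit(T_INF)
```

With these definitions, `lt(leq, scale(5, t1), t2)` holds, and `inf + t1` and `inf + t2` are below each other, matching Example 20.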

The following lemma states an important property to guarantee livelock- 
freedom. It follows from the fact that the set of time units is well-founded. 

Lemma 21. The set of time quantums is well-founded with respect to $<$, i.e., 
there is no infinite decreasing sequence $t_0 > t_1 > t_2 > \cdots$. 

4.3 Usages and Types 

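As mentioned in Section 4.1, we replace the capability/obligation attributes of 
a usage with two time quantums $t_o, t_c$ and remove the time tag from a channel 
type. 

Definition 22 (usages). The set of usages is given by the following syntax. 

\[
\begin{array}{lcl}
U & ::= & \mathbf{0} \mid \alpha^{t_o}_{t_c}.U \mid (U_1 \parallel U_2) \mid {*U} \\
\alpha & ::= & I \mid O
\end{array}
\]

Notation 23. We give a higher precedence to prefixes ($\alpha^{t_o}_{t_c}.$ and $*$) than to $\parallel$. 
So, $\alpha^{t_o}_{t_c}.U_1 \parallel U_2$ means $(\alpha^{t_o}_{t_c}.U_1) \parallel U_2$, not $\alpha^{t_o}_{t_c}.(U_1 \parallel U_2)$. We sometimes write $\alpha$ for 
$I$ or $O$. $\overline{\alpha}$ denotes $O$ when $\alpha = I$ and $I$ when $\alpha = O$. 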

$O^{t_o}_{t_c}.U$ denotes the usage of a channel that can be first used for output, and 
then used according to $U$. Intuitively, $t_o$ means that an output process must be 
executed within time $t_o$. If $\mathbf{t}_{\infty} \leq t_o$, the output need not be performed. $t_c$ means 
that the output is guaranteed to succeed within time $t_c$ after it is executed. More 
precisely, $t_c$ is the time limit within which a communication partner is found 
and the communication starts: It takes $\mathbf{t}_{\min}$ more until the communication is 
completed. If $\mathbf{t}_{\infty} \leq t_c$, then the output is not guaranteed to succeed. $I^{t_o}_{t_c}.U$ 
denotes the usage of a channel that must be first used for input within time $t_o$, 
and then used according to $U$ if the input succeeds. The input is guaranteed to 
succeed within time $t_c$. The meaning of the other usage constructors is the same 
as that in the previous type systems (see Section 3.1). 

Definition 24 (types). The set of types is given by the following syntax. 

\[
\tau ::= \mathit{bool} \mid [\tau_1, \ldots, \tau_n]/U
\]

$[\tau_1, \ldots, \tau_n]/U$ denotes the type of a channel that can be used for communicating 
a tuple of values of types $\tau_1, \ldots, \tau_n$. The channel must be used according to the 
usage $U$. 

We introduce several operations on usages and types. $\uparrow_t U$ represents the 
usage of a channel that is used according to $U$ after a delay of at most $t$. 






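Definition 25. A unary operation $\uparrow_t$ on usages is defined inductively by: 

\[
\begin{array}{ll}
\uparrow_t \mathbf{0} = \mathbf{0} &
\uparrow_t (\alpha^{t_o}_{t_c}.U) = \alpha^{t_o + t}_{t_c}.U \\[1ex]
\uparrow_t (U_1 \parallel U_2) = {\uparrow_t U_1} \parallel {\uparrow_t U_2} &
\uparrow_t ({*U}) = {*{\uparrow_t U}}
\end{array}
\]

Constructors and operations on usages are extended to operations on types. 

Definition 26. Operations on types are defined by: 

\[
\begin{array}{ll}
\mathit{bool} \parallel \mathit{bool} = \mathit{bool} &
([\tilde{\tau}]/U_1) \parallel ([\tilde{\tau}]/U_2) = [\tilde{\tau}]/(U_1 \parallel U_2) \\[1ex]
{*\mathit{bool}} = \mathit{bool} &
{*[\tilde{\tau}]/U} = [\tilde{\tau}]/{*U} \\[1ex]
{\uparrow_t \mathit{bool}} = \mathit{bool} &
{\uparrow_t [\tilde{\tau}]/U} = [\tilde{\tau}]/{\uparrow_t U}
\end{array}
\]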

4.4 Reliability of Usages 

As in the type systems for deadlock-freedom (see Section 3.1), the usage of each 
channel must be consistent (reliable) in the sense that each input/output 
capability is guaranteed by an output/input obligation. For example, consider a usage 
$I^{\infty}_{t_c}.U_1 \parallel O^{t_o}_{\infty}.U_2$. In order for the success of input to be guaranteed, it must be 
the case that $t_o \leq t_c$. This consistency should be preserved during the whole 
computation. After communication on a channel of this usage happens, 
the channel is used according to $U_1 \parallel U_2$. So, $U_1 \parallel U_2$ must also be consistent. To 
state such a condition, we first define reduction of a usage. 

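Definition 27. $\equiv$ is the least congruence relation satisfying the following rules: 

\[
\begin{array}{ll}
U \equiv U \parallel \mathbf{0} & {*\mathbf{0}} \equiv \mathbf{0} \\[1ex]
U_1 \parallel U_2 \equiv U_2 \parallel U_1 & (U_1 \parallel U_2) \parallel U_3 \equiv U_1 \parallel (U_2 \parallel U_3) \\[1ex]
{*U} \equiv {*U} \parallel U & {*(U_1 \parallel U_2)} \equiv {*U_1} \parallel {*U_2} \\[1ex]
{*{*U}} \equiv {*U} &
\end{array}
\]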

Definition 28 (usage reduction). A binary relation $\longrightarrow$ on usages is the 
least relation closed under the following rules: 
(i) $I^{t_o}_{t_c}.U_1 \parallel O^{t_o'}_{t_c'}.U_2 \parallel U_3 \longrightarrow U_1 \parallel U_2 \parallel U_3$, and 
(ii) $U_1 \longrightarrow U_2$ if $U_1 \equiv U_1'$, $U_1' \longrightarrow U_2'$, and $U_2' \equiv U_2$. 

Now, a usage $U$ is defined to be reliable if after any reduction steps, 
whenever it contains an input/output capability (i.e., it is structurally congruent 
to $\alpha^{t_o}_{t_c}.U_1 \parallel U_2$), it contains an output/input obligation of an equal or shorter 
time bound. We first define a predicate to judge whether a usage contains an 
obligation, and then give a formal definition of the reliability. 

Definition 29 (obligation). The relations $ob_I(t, U)$ and $ob_O(t, U)$ between a time 
quantum and a usage are defined by: $ob_{\alpha}(t, U)$ if and only if $\mathbf{t}_{\infty} \leq t$ or 
$\exists t_o, t_c, U_1, U_2.((U \equiv \alpha^{t_o}_{t_c}.U_1 \parallel U_2) \land (t_o \leq t))$. 

Definition 30 (reliability). $U$ is reliable, written $rel(U)$, if $ob_{\overline{\alpha}}(t_c, U_2)$ holds 
whenever $U \longrightarrow^{*} \equiv \alpha^{t_o}_{t_c}.U_1 \parallel U_2$. A type $\tau$ is reliable, written $rel(\tau)$, if it is of the 
form $[\tilde{\tau}]/U$ and $rel(U)$ holds. 
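
For replication-free usages, the reliability condition can be checked by exhaustive exploration of usage reductions. The following Python sketch makes several simplifying assumptions that are ours, not the paper's: time quantums are plain numbers with `math.inf` standing for an infinite bound (the text above notes that natural numbers would be a possible, less flexible choice), a usage is a tuple of exposed actions composed in parallel, and replication is not supported. The names `Act`, `co`, `ob`, `successors` and `reliable` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Tuple

INF = math.inf

@dataclass(frozen=True)
class Act:
    kind: str                       # "I" or "O"
    t_o: float                      # obligation bound (INF: no obligation)
    t_c: float                      # capability bound (INF: no capability)
    cont: Tuple["Act", ...] = ()    # continuation usage; () encodes the usage 0

def co(kind):
    """The complementary action: O for I, and I for O."""
    return "O" if kind == "I" else "I"

def ob(kind, t, usage):
    """ob_kind(t, U): t is infinite, or U exposes a `kind` action with t_o <= t."""
    return t == INF or any(a.kind == kind and a.t_o <= t for a in usage)

def successors(usage):
    """One-step reductions: an exposed I and an exposed O are consumed together."""
    for i, a in enumerate(usage):
        for j, b in enumerate(usage):
            if a.kind == "I" and b.kind == "O":
                rest = tuple(c for k, c in enumerate(usage) if k not in (i, j))
                yield rest + a.cont + b.cont

def reliable(usage):
    """Every exposed capability is backed by a complementary obligation,
    now and after any sequence of usage reductions."""
    for i, a in enumerate(usage):
        rest = usage[:i] + usage[i + 1:]
        if not ob(co(a.kind), a.t_c, rest):
            return False
    return all(reliable(v) for v in successors(usage))

# Input with capability bound 5 against an output with obligation bound 3.
good = (Act("I", INF, 5), Act("O", 3, INF))
# Same shape, but the output may come too late for the promised input.
bad = (Act("I", INF, 2), Act("O", 3, INF))
# A lone output that claims a capability, as in (nu x) x!^c[].
lonely = (Act("O", INF, 1),)
```

In this sketch `reliable(good)` holds, while `reliable(bad)` and `reliable(lonely)` fail. The search terminates because each reduction consumes two actions.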

Remark 31. The above definitions of $ob_{\alpha}$ and the reliability do not completely 
match the intuition on usages. Consider a usage $U = O^{t}_{\infty}.\mathbf{0} \parallel {*I^{\infty}_{t}.O^{t}_{\infty}.\mathbf{0}}$ of 
a binary semaphore. According to the above definitions, $U$ is reliable. Actually, 
however, an input operation may not succeed within time $t$: Although an output 
operation is always executed within $t$, the input may be delayed because of other 
input operations. If another process succeeds in inputting from the channel, it must 
wait for another period of length $t$. So, $U$ should not be reliable. (Under the fair 
scheduling, every input succeeds after a finite number of output operations. So, 
according to the intuition, $O^{t}_{\infty}.\mathbf{0} \parallel {*I^{\infty}_{t'}.O^{t}_{\infty}.\mathbf{0}}$ should be reliable if $t \ll t'$.) For 
the guarantee of livelock-freedom, however, the above definition of the reliability 
is sufficient and much simpler. We will redefine the reliability in Section 5 to 
guarantee time-boundedness. 



4.5 Subusage and Subtyping 



The subusage relation defined below allows one usage to be viewed as another 
usage. For example, consider the usage $I^{\infty}_{t_c}.\mathbf{0} \parallel I^{\infty}_{t_c}.\mathbf{0}$ of a channel that can be 
used twice for input, possibly in parallel. Because the input is not an obligation, 
it should be allowed to use the channel for input just once for now, and use once 
more only after the input succeeds, as expressed by $I^{\infty}_{t_c}.I^{\infty}_{t_c}.\mathbf{0}$. Such a relation 
between usages is expressed by $I^{\infty}_{t_c}.\mathbf{0} \parallel I^{\infty}_{t_c}.\mathbf{0} \leq I^{\infty}_{t_c}.I^{\infty}_{t_c}.\mathbf{0}$. 

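Definition 32. The subusage relation $\le$ on usages is the least reflexive and transitive relation closed under the following rules:

$$\alpha^{t_\infty}_{t_c}.U \le 0 \qquad \text{(SubU-Zero)}$$

$$\frac{U \cong U'}{U \le U'} \qquad \text{(SubU-Cong)}$$

$$\frac{U_1 \le U_1' \qquad U_2 \le U_2'}{U_1 \parallel U_2 \le U_1' \parallel U_2'} \qquad \text{(SubU-Par)}$$

$$\frac{U \le U'}{*U \le *U'} \qquad \text{(SubU-Rep)}$$

$$\alpha^{t_o}_{t_c}.U_1 \parallel \uparrow^{t_o + t_c + t_{min}} U_2 \le \alpha^{t_o}_{t_c}.(U_1 \parallel U_2) \qquad \text{(SubU-Delay)}$$

$$\frac{U \le U' \qquad t_o' \le t_o \qquad t_c \le t_c'}{\alpha^{t_o}_{t_c}.U \le \alpha^{t_o'}_{t_c'}.U'} \qquad \text{(SubU-Act)}$$
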



The rule (SubU-Zero) indicates that a channel with no obligation need not be used at all. The rule (SubU-Delay) allows some usage to be delayed until a communication succeeds. In the usage $\alpha^{t_o}_{t_c}.(U_1 \parallel U_2)$, an obligation contained in $U_2$ is delayed until the operation $\alpha$ (which is I or O) is executed and succeeds. Because the time $t_o$ may pass before it is executed and another time period of length $t_c$ may pass before it is enabled, the time $t_o + t_c + t_{min}$ may be required before the obligations in $U_2$ are fulfilled. This is taken into account by the operation $\uparrow^{t_o + t_c + t_{min}}$ on the lefthand side. The rule (SubU-Act) means that it is safe to estimate the time bound of an obligation to be shorter than the actual time bound, while the time bound of a capability should be estimated to be longer than the actual bound.

The subusage relation can be extended to the following subtyping relation. 



Definition 33 (subtyping). The subtyping relation $\le$ is the least relation closed under the rules: (i) $bool \le bool$, and (ii) $[\tilde\tau]/U \le [\tilde\tau]/U'$ if $U \le U'$.



Definition 34. $noob(\tau)$ if $\tau = bool$, or $\tau = [\tilde\tau]/U$ and $U \le 0$.



4.6 Type Environment 



A type environment is a mapping from a finite set of variables to types. We use a metavariable $\Gamma$ for a type environment. We use a sequence $v_1 : \tau_1, \ldots, v_n : \tau_n$ to denote the type environment $\Gamma$ such that $dom(\Gamma) = \{v_1, \ldots, v_n\} \setminus \{true, false\}$ and $\Gamma(v_i) = \tau_i$ for each $v_i \in dom(\Gamma)$. In the sequence, if $v_i = true$ or $v_i = false$, $\tau_i$ must be bool. (So, the sequence $x : \tau, true : bool$ denotes the same type environment as $x : \tau$. The sequence $true : [\tilde\tau]/U$ is invalid.) We write $\emptyset$ for the type environment whose domain is empty. When $x \notin dom(\Gamma)$, we write $\Gamma, x : \tau$ for the type environment $\Gamma'$ satisfying $dom(\Gamma') = dom(\Gamma) \cup \{x\}$, $\Gamma'(x) = \tau$, and $\Gamma'(y) = \Gamma(y)$ for $y \in dom(\Gamma)$. $\Gamma \setminus \{x_1, \ldots, x_n\}$ denotes the type environment $\Gamma'$ such that $dom(\Gamma') = dom(\Gamma) \setminus \{x_1, \ldots, x_n\}$ and $\Gamma'(x) = \Gamma(x)$ for each $x \in dom(\Gamma')$.

The operations on types are pointwise extended to those on type environ- 
ments as follows. 

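Definition 35 (operations on type environments). The operations $\parallel$, $*$, and $\uparrow^{t}$ on type environments are defined by:

$$dom(\Gamma_1 \parallel \Gamma_2) = dom(\Gamma_1) \cup dom(\Gamma_2)$$

$$(\Gamma_1 \parallel \Gamma_2)(x) = \begin{cases} \Gamma_1(x) \parallel \Gamma_2(x) & \text{if } x \in dom(\Gamma_1) \cap dom(\Gamma_2) \\ \Gamma_1(x) & \text{if } x \in dom(\Gamma_1) \setminus dom(\Gamma_2) \\ \Gamma_2(x) & \text{if } x \in dom(\Gamma_2) \setminus dom(\Gamma_1) \end{cases}$$

$$dom(*\Gamma) = dom(\Gamma) \qquad (*\Gamma)(x) = *(\Gamma(x))$$

$$dom(\uparrow^{t} \Gamma) = dom(\Gamma) \qquad (\uparrow^{t} \Gamma)(x) = \uparrow^{t}(\Gamma(x))$$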
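The pointwise parallel composition of type environments can be illustrated with a small Python sketch. The encoding is an assumption made only for this example: a channel type $[\tilde\tau]/U$ is a pair `(args, usage)` whose usage is a tuple of opaque atoms read as a parallel composition, `bool` composed with `bool` is taken to be `bool`, and the names `par_type` and `par_env` are hypothetical.

```python
from typing import Dict, Tuple, Union

# Hypothetical encoding: a channel type [args]/U is the pair (args, usage),
# where `usage` is a tuple of opaque atoms read as a parallel composition.
ChanType = Tuple[tuple, tuple]
Type = Union[str, ChanType]        # either the string "bool" or a channel type
Env = Dict[str, Type]


def par_type(t1: Type, t2: Type) -> Type:
    """Compose two types; channel types must agree on their argument types."""
    if t1 == "bool" and t2 == "bool":
        return "bool"              # assumption: bool || bool = bool
    if t1 == "bool" or t2 == "bool":
        raise TypeError("cannot compose bool with a channel type")
    (args1, u1), (args2, u2) = t1, t2
    if args1 != args2:
        raise TypeError("argument types differ")
    return (args1, u1 + u2)        # usages are composed in parallel


def par_env(g1: Env, g2: Env) -> Env:
    """Pointwise composition: shared variables are composed, others are copied."""
    out: Env = {}
    for x in g1.keys() | g2.keys():
        if x in g1 and x in g2:
            out[x] = par_type(g1[x], g2[x])
        elif x in g1:
            out[x] = g1[x]
        else:
            out[x] = g2[x]
    return out
```

The domain of the result is the union of the two domains, and only variables in the intersection have their usages combined, which mirrors the three cases of the definition.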



Definition 36. $\Gamma$ is reliable, written $rel(\Gamma)$, if $rel(\Gamma(x))$ holds for each $x \in dom(\Gamma)$.



Definition 37. A binary relation $\le$ on type environments is defined by: $\Gamma_1 \le \Gamma_2$ if and only if (i) $dom(\Gamma_1) \supseteq dom(\Gamma_2)$, (ii) $\Gamma_1(x) \le \Gamma_2(x)$ for each $x \in dom(\Gamma_2)$, and (iii) $noob(\Gamma_1(x))$ for each $x \in dom(\Gamma_1) \setminus dom(\Gamma_2)$.




380 N. Kobayashi 




Fig. 1. Typing Rules 



4.7 Typing Rules 



The typing rules are shown in Figure 1. Each rule is explained below.

(T-Par) The premises imply that $P_1$ uses variables as described by $\Gamma_1$, and in parallel to this, $P_2$ uses variables as described by $\Gamma_2$. So, the type environment of $P_1 \mid P_2$ should be represented as the combination $\Gamma_1 \parallel \Gamma_2$. For example, if $\Gamma_1 = x : []/I^{t_o}_{t_c}.0$ and $\Gamma_2 = x : []/O^{t_o'}_{t_c'}.0$, then $P_1 \mid P_2$ should be well typed under $x : []/(I^{t_o}_{t_c}.0 \parallel O^{t_o'}_{t_c'}.0)$.

(T-Rep) Because $*P$ runs infinitely many copies of P in parallel and the premise implies that each P uses variables according to $\Gamma$, $*P$ uses variables according to $*\Gamma$ as a whole.

(T-New) $(\nu x)\,P$ is well typed if P is well typed and it uses x as a channel of a reliable usage.

(T-If) Since if v then P else Q executes either P or Q, P and Q must be well typed under the same type environment $\Gamma$. Assuming that it takes the time $t_{min}$ to check whether v is true or false, we can express the total use of variables by if v then P else Q as $\uparrow^{t_{min}} \Gamma \parallel v : bool$.

(T-Weak) $\Gamma \le \Gamma'$ means that $\Gamma$ represents a more liberal use of variables than $\Gamma'$. So, if P is well typed under $\Gamma'$, it is also well typed under $\Gamma$.

(T-Out) The lefthand premise implies that P uses x as a channel of usage U and uses other variables according to $\Gamma$. Because $x!^{a}[v_1, \ldots, v_n].P$ uses x for output before that, the total usage of x is expressed by $O^{t_o}_{t_c}.U$. Other variables are used by P and the receiver of $[v_1, \ldots, v_n]$ according to $v_1 : \tau_1 \parallel \cdots \parallel v_n : \tau_n \parallel \Gamma$ only after the communication on x succeeds. Because it may take the time $t_c$ until the communication is enabled and it takes $t_{min}$ more before the communication completes, the total use of other variables is estimated as $\uparrow^{t_c + t_{min}} (v_1 : \tau_1 \parallel \cdots \parallel v_n : \tau_n \parallel \Gamma)$. The righthand premise requires that the time bound of the capability must be finite if the output is annotated with c.

(T-In) This is similar to the rule (T-Out). Because P uses x according to the usage U, the total usage of x is expressed as $I^{t_o}_{t_c}.U$. Because $x?^{a}[y_1, \ldots, y_n].P$ executes P only after the communication on x has completed, the total use of other variables is estimated as $\uparrow^{t_c + t_{min}} \Gamma$.



4.8 Type Soundness 

Now we show that our type system is sound in the sense that any closed well-typed process is livelock-free (Corollary 42). We omit most of the proofs but present a proof sketch of the main theorem (Theorem 41) because it may be particularly interesting: while the usual type soundness refers to a safety property that a bad thing never happens, livelock-freedom is a liveness property that a good thing eventually happens. Full proofs are given in the full version of this paper [9].

Type soundness comes from the subject reduction theorem (Theorem 39), which states that well-typedness of a process is preserved by reduction, plus a property that some progress is always made by reduction (Lemma 40). As in the linear π-calculus, the type environment of a process may change during reduction. For example, while a process $x![] \mid x?[].0$ is well typed under $x : []/(O^{t_o}_{t_c}.0 \parallel I^{t_o'}_{t_c'}.0)$, the reduced process 0 is well typed under $x : []/0$. This change of a type environment is captured by the following relation $\Gamma \stackrel{l}{\longrightarrow} \Delta$.

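Definition 38. A 3-place relation $\Gamma \stackrel{l}{\longrightarrow} \Delta$ is defined to hold if one of the following conditions holds:

1. $l = \epsilon$ and $\Gamma = \Delta$.

2. $l = x$, $\Gamma = (\Gamma', x : [\tilde\tau]/U)$, $\Delta = (\Gamma', x : [\tilde\tau]/U')$, and $U \longrightarrow U'$ for some $\Gamma'$, $\tilde\tau$, $U$, and $U'$.

We write $\Gamma \longrightarrow \Delta$ when $\Gamma \stackrel{l}{\longrightarrow} \Delta$ for some l.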

Theorem 39 (subject reduction). If $\Gamma \vdash P$ and $P \stackrel{l}{\longrightarrow} Q$, then there exists $\Delta$ such that $\Delta \vdash Q$ and $\Gamma \stackrel{l}{\longrightarrow} \Delta$.

The following lemma states that after a communication succeeds, the processes that have been blocked by the communication can fulfill obligations within shorter time bounds.

Lemma 40. If $\Gamma, x : [\tilde\tau]/U \vdash x!^{a}[\tilde v].P \mid x?^{a'}[\tilde y].Q$, then there exist $\Delta, U'$ such that $\Delta, x : [\tilde\tau]/U' \vdash P \mid [\tilde y \mapsto \tilde v]Q$, $\Gamma \le \uparrow^{t_{min}} \Delta$, and $U \longrightarrow U'$.

The following main theorem says that any obligation with a finite time bound is eventually fulfilled (the properties A and B below), and that any capability with a finite time bound can eventually be used (the properties C and D).

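Theorem 41. Suppose that $P_0 \longrightarrow P_1 \longrightarrow P_2 \longrightarrow \cdots$ is a full fair reduction sequence. If $t < t_\infty$, the following properties hold.

A (Output Obligation). If (i) $\Gamma \vdash P_0$, (ii) $rel(\Gamma)$, (iii) $\Gamma = \Gamma', x : [\tilde\tau]/U$, and (iv) $ob_O(t, U)$, then there exists $n \ge 0$ such that $P_n \succeq (\nu\tilde w)\,(x!^{a}[\tilde v].Q_1 \mid Q_2)$.

B (Input Obligation). If (i) $\Gamma \vdash P_0$, (ii) $rel(\Gamma)$, (iii) $\Gamma = \Gamma', x : [\tilde\tau]/U$, and (iv) $ob_I(t, U)$, then there exists $n \ge 0$ such that $P_n \succeq (\nu\tilde w)\,(x?^{a}[\tilde y].Q_1 \mid Q_2)$.

C (Output Capability). If (i) $P_0 \succeq x!^{a}[\tilde v].Q \mid R$, (ii) $\Delta_1, x : [\tilde\tau]/O^{t_o}_{t}.U_1 \vdash x!^{a}[\tilde v].Q$, (iii) $\Delta_2 \vdash R$, and (iv) $rel((\Delta_1, x : [\tilde\tau]/O^{t_o}_{t}.U_1) \parallel \Delta_2)$, then there exists $n \ge 0$ such that $P_n \succeq (\nu\tilde w)\,(x!^{a}[\tilde v].Q \mid x?^{a'}[\tilde y].R_1 \mid R_2)$ and $(\nu\tilde w)\,(Q \mid [\tilde y \mapsto \tilde v]R_1 \mid R_2) \succeq P_{n+1}$.

D (Input Capability). If (i) $P_0 \succeq x?^{a}[\tilde y].Q \mid R$, (ii) $\Delta_1, x : [\tilde\tau]/I^{t_o}_{t}.U_1 \vdash x?^{a}[\tilde y].Q$, (iii) $\Delta_2 \vdash R$, and (iv) $rel((\Delta_1, x : [\tilde\tau]/I^{t_o}_{t}.U_1) \parallel \Delta_2)$, then there exists $n \ge 0$ such that $P_n \succeq (\nu\tilde w)\,(x?^{a}[\tilde y].Q \mid x!^{a'}[\tilde v].R_1 \mid R_2)$ and $(\nu\tilde w)\,([\tilde y \mapsto \tilde v]Q \mid R_1 \mid R_2) \succeq P_{n+1}$.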

Proof Sketch. The proof proceeds by induction on t (notice that the relation < on the subset $\{t \mid t < t_\infty\}$ of time quantums is well-founded: there is no infinite decreasing sequence). We show only the properties A and C. B and D are similar. Suppose the theorem holds for any t' such that t' < t.

A. By the assumption $\Gamma \vdash P_0$, there must exist $P_{01}$ and $P_{02}$ such that (a) $P_0 \succeq (\nu\tilde w)\,(P_{01} \mid P_{02})$, (b) $\Gamma_1, x : [\tilde\tau]/(O^{t}_{t_c}.U_3 \parallel U_4) \vdash P_{01}$, and (c) $P_{01}$ is an output, input, or conditional process. Without loss of generality, we can assume that $(\nu\tilde w)\,(P_{01} \mid P_{02}) = P_0$. If $P_{01}$ is of the form $x!^{a}[\tilde v].R$, the result immediately holds. Otherwise, $P_{01}$ must be if b then $P_{01}'$ else $P_{01}''$ with b being true or false, or trying to use an input or output capability on another channel with a time limit shorter than t.

In the former case, by the assumption of fairness, $P_{01}$ is eventually reduced to $P_{01}'$ or $P_{01}''$. By the typing rules, $P_{01}'$ and $P_{01}''$ are well typed under $\Gamma_1, x : [\tilde\tau]/(O^{t'}_{t_c}.U_3 \parallel U_4)$ for some t' such that $t' + t_{min} \le t$ (which implies $t' < t$ as $t < t_\infty$). So, the required result follows from the property A of the induction hypothesis.

In the latter case, by the property C of the induction hypothesis, $P_{01}$ must be eventually reduced. By Theorem 39 and Lemma 40, the resulting process is well typed under some type environment $\Delta, x : [\tilde\tau]/(O^{t'}_{t_c}.U_5 \parallel U_6)$ such that $t' < t$. (See the full paper [9] for details.) So, the required result follows from the property A of the induction hypothesis.

C. Suppose there exists no such n. Then, it must be the case that $P_i \succeq x!^{a}[\tilde v].Q \mid R_i$ and $R \longrightarrow R_1 \longrightarrow R_2 \longrightarrow \cdots$ is a full fair reduction sequence. We show that there exist infinitely many i such that $R_i \succeq (\nu\tilde w)\,(x?^{a'}[\tilde y].R_i' \mid R_i'')$, which contradicts the assumption that the reduction sequence $P_0 \longrightarrow P_1 \longrightarrow P_2 \longrightarrow \cdots$ is fair. Let j be an arbitrary natural number. Then, by subject reduction (Theorem 39), $\Delta', x : [\tilde\tau]/U' \vdash R_j$ and $\Delta_2 \longrightarrow^{*} (\Delta', x : [\tilde\tau]/U')$ for some $\Delta'$ and $U'$. By the assumption (iv) and the definition of reliability, $ob_I(t, U')$ must hold. We also have that $\Gamma', x : [\tilde\tau]/(O^{t_o}_{t}.U_1 \parallel U') \vdash x!^{a}[\tilde v].Q \mid R_j$ and $rel(\Gamma')$ for $\Gamma' = \Delta_1 \parallel \Delta'$. Therefore, by the property B, there exists $i \ge j$ such that $R_i \succeq (\nu\tilde w)\,(x?^{a'}[\tilde y].R_i' \mid R_i'')$. Thus, we have shown that there exist infinitely many i such that $R_i \succeq (\nu\tilde w)\,(x?^{a'}[\tilde y].R_i' \mid R_i'')$. □

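Corollary 42 (livelock freedom). If $\emptyset \vdash P$, then P is livelock-free.

Proof. Suppose $\emptyset \vdash P$ and $P \longrightarrow^{*} P_1 \succeq (\nu\tilde w)\,(x!^{c}[\tilde v].Q \mid R)$. By Theorem 39, $\tilde w : \tilde\tau \vdash x!^{c}[\tilde v].Q \mid R$ and $rel(\tilde w : \tilde\tau)$. Moreover, by the typing rules, it must be the case that $\Delta_1, x : [\tilde\tau']/O^{t_o}_{t_c}.U_1 \vdash x!^{c}[\tilde v].Q$, $t_c < t_\infty$, $\Delta_2 \vdash R$, and $rel((\Delta_1, x : [\tilde\tau']/O^{t_o}_{t_c}.U_1) \parallel \Delta_2)$. So, the required result follows from the property C of Theorem 41. The case for input is similar. □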

5 Time-Bounded Processes 

In this section, we briefly show that with a minor modification, our type system 
can guarantee not only that certain communications eventually succeed, but also 
that some of them succeed within a certain number of (parallel) reduction steps. 



5.1 Time-Boundedness 

We first define the time-boundedness of a process. We extend the syntax of 
processes to allow a programmer to declare an upper-bound of the number of 
reduction steps required until an input or output operation succeeds. 

$P ::= \cdots \mid x!^{n}[\tilde v].P \mid x?^{n}[\tilde y].P$

The annotation n of $x!^{n}[\tilde v].P$ indicates that the programmer wants this output to succeed within n parallel reduction steps (defined below) after it is executed. (Strictly speaking, the output process can find its communication partner within n reduction steps and complete the communication in the next step.)

In counting the number of reduction steps, we assume unlimited parallelism, so that communications on different channels can occur in parallel. To model such parallel reduction, we introduce parallel reduction relations $P \Longrightarrow Q$ and $P \stackrel{S}{\Longrightarrow} Q$, where S is a set of channels. $P \Longrightarrow Q$ means that P is reduced to Q by reducing every conditional expression and performing one communication on every channel whenever possible. $P \stackrel{S}{\Longrightarrow} Q$ is the same as $P \Longrightarrow Q$ except that communications on free channels occur only on those in the set S. We assume here that at most one communication can occur on each channel and that the two processes to communicate are chosen randomly. So, it takes two steps to reduce $*x?[].0 \mid x![] \mid x![]$ to $*x?[].0$. It would be possible to change the reduction rules and the type system so as to make as many communications as possible occur simultaneously on each channel and/or to reflect a certain scheduling.
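The step-counting convention described above can be checked with a small Python sketch. It is a deliberately simplified model, not the paper's reduction relation: each channel is summarised by the number of pending inputs and outputs (with `math.inf` standing for a replicated input), continuations are assumed to be 0, and the function name `parallel_steps` is hypothetical.

```python
import math


def parallel_steps(pending):
    """Count parallel steps until no channel can communicate.

    `pending` maps a channel name to (inputs, outputs); `inputs` may be
    math.inf to model a replicated input such as *x?[].0.
    """
    state = dict(pending)
    steps = 0
    while True:
        fired = False
        for ch, (ins, outs) in state.items():
            if ins > 0 and outs > 0:
                # at most one communication per channel in each parallel step
                state[ch] = (ins - 1, outs - 1)   # math.inf - 1 is still math.inf
                fired = True
        if not fired:
            return steps
        steps += 1
```

With one replicated input and two outputs on x, the sketch reports two steps, matching the example in the text; adding an independent channel y with one input and one output does not increase the count, since communications on different channels proceed in parallel.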
Definition 43 (parallel process reduction). $\stackrel{S}{\Longrightarrow}$ (where S is a set of variables) and $\Longrightarrow$ are the least relations closed under the rules given below. $a^{-}$ is defined by: $n^{-} = n - 1$, $c^{-} = c$, and $0^{-} = 0$.



[v].P\xl^ 



' [z].Q P\[zi-^ v]Q 



0 




0 




0 



if true then P else Q P 



if false then P else Q Q 




n ^2 = 0 



p>p' P' ^Q' Q'hQ 





P 



s 



Q 




p 



Q 



The rules in the first two lines ensure that in each parallel reduction step, every input or output process is either reduced or its time bound is decreased by 1. The rules in the third line ensure that every conditional process is reduced. The righthand premises of the rule for $(\nu x)\,P$ and the last rule make sure that a communication happens on every channel whenever possible.

A process is time-bounded if whenever the time bound of an input or output process has become 0 (i.e., it becomes $x!^{0}[\tilde v].P$ or $x?^{0}[\tilde y].P$), the input or output operation always succeeds in the next parallel reduction step:

Definition 44 (time-boundedness). A process P is time-bounded if the fol- 
lowing conditions hold whenever P =^* P' . 



1. If P' ^ (z/rc) (a:!°[{;]. Q I i?), then R and R ^ {i^u) {xl°^[y]. Ri \ R 2 ) for 



some u, i?i, i?2- 

2. If P' ^ {i^w) {x7^[y].Q\R), then R 7^ and R ^ {vu) {xP[v]. Ri \ R 2 ) for 
some u, V, i?i, i?2- 

5.2 Modification to the Type System 

Now we show how to modify the type system to guarantee the time-boundedness. Because the typing rules in Section 4 already take into account the delay caused by communications on other channels, we just need to refine the reliability condition to estimate the channel-wise behavior more correctly. As stated in Remark 31, a problem of Definition 30 is that it does not take race conditions
into account. For example, .0 1 1 /J” .0 1 1 /J” .Ot„c ^ .0 is reliable according 

to Definition 30, but only one input is guaranteed to succeed immediately: the other input must wait for a time period of length $t_{min}$ until an output is executed
again. So, the correct usage should be II II 

We redefine usage reduction to take race conditions into account. For exam- 
ple, O°^.0 II II is reduced to O°^.0 || /g“.O°^.0, which 
can be further reduced to 0 ° . 0 . To define a new usage reduction relation, 

we introduce auxiliary relations $\stackrel{S}{\Longrightarrow}$ where $\{com, I, O\} \supseteq S$. When $com \in S$, $U \stackrel{S}{\Longrightarrow} V$ means that one pair of an input usage and an output usage is reduced. $I \in S$ ($O \in S$, resp.) indicates that an input (output, resp.) action is ready but kept waiting (either because no output action is ready or because another input is chosen for communication).

Definition 45 (timed usage reduction). Binary relations $\Longrightarrow$ and $\stackrel{S}{\Longrightarrow}$ ($\{com, I, O\} \supseteq S$) on usages are the least relations closed under the following rules:



Ill-U, II ol?.U2 



{com} 



U1WU2 



0 < tc 



o=i:^o 

0 < to 



a °.C/ 

tc 



C/i 



U[ 



U2 



UL 



com ^ Si n So 



u 



tfj JT 0 . tfy TJ 

ctf°.U .U 

s 



U' 



com 4 S 



U,\\U2^^^Ui\\U^ 



*u 



*U' 



U^U' 



U' 



V 



V'^v 



u^v 



com G S if { 1 , 0} C S 



U^V 



U^V

Here, $t^{-} = (n - 1) \cdot t_{min}$ if $t = n \cdot t_{min}$, and $t^{-} = t$ otherwise.



The left rule in the second line is for the case where the (input or output) action $\alpha$ has already been executed but does not succeed in this reduction step: in this case, the time limit of the capability is reduced by $t_{min}$. The right rule in the second line is for the case where the action $\alpha$ has not been executed yet: in this case, the time limit of the obligation is reduced by $t_{min}$. The right rule at the bottom ensures that in each reduction step, if both input and output actions are ready, some pair of an input usage and an output usage must be reduced.

Using the above parallel usage reduction, we can strengthen the reliability 
condition as defined below. The first condition ensures that when the time limit 
of an input or output capability has reached 0, the input or output operation 
must succeed in the next step. 



Definition 46 (reliability (refined)). A usage U is reliable if: 



1. If U =^*= aff.Ui\\U2, then U2 and there exist tc,U^,Ui such that 

U2 = II U4; and 

2. U — >■*= al°.Ui II C/2 implies oba{tc,U2). 



We introduce the following typing rules for new processes. 
r,x:[Ti,...,Tn]/U'rP tc<n-t^



tc + tr 



{vi : Ti 



Vn'- Tn 



r) II a: : [n, . . . ,T„]/Ot“.C/ h . . .,Vn].P 

(T-BOut) 



P,x : [ti, . . . , Tn]/U, yi : Ti,. . . ,Un ■■ Tn\- P tc<n-tr 



tc + tr 



r,x : [Ti,...,Tn]/It°-U\- x7’^[yi,...,yn].P 



(T-BIn) 



We can prove the following time-boundedness theorem. The proof is basically similar to that of the livelock-freedom property (Corollary 42): it suffices to prove a parallel version of the subject reduction property (Theorem 39), although it is harder.

Theorem 47. Every closed well-typed process is time-bounded. 



6 Extensions with Dependent Types 

The typed livelock-free process calculus in Section 4 is not expressive enough. For example, consider the following process P, which is a function server computing the factorial of a natural number:

$*fact?[n, r].\,(\textbf{if } n = 0 \textbf{ then } r![1] \textbf{ else } (\nu r')\,(fact![n - 1, r'] \mid r'?[m].\,r![m \times n]))$

$P \mid (\nu y)\,(fact![3, y] \mid y?^{c}[x].\,0)$ cannot be judged to be livelock-free in our type system. Suppose r has type $[Nat]/O^{t_o}_{t_c}.0$. Because r' is passed through the channel fact, r' must have the same type $[Nat]/O^{t_o}_{t_c}.0$. According to the rule (T-In), in order for $r'?[m].\,r![m \times n]$ to be well typed, it must be the case that $t_o + t_{min} \le t_o$, which implies $t_\infty \le t_o$. So, it is not guaranteed that P returns a result eventually.

The problem of the above example is that the time bound of an output 
obligation on r depends on the other argument n. We can use dependent types 
to express such dependency: In the above process, the type of the channel fact 
can be expressed as [En : Nat.[Nat]/Og"''*'"‘".0]/17. 
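To make the behavior of the factorial server concrete, the following Python sketch mimics the process with threads and queues. It is only an operational illustration under assumptions that are not in the paper: channels are modelled by `queue.Queue`, replication by spawning one thread per request, and the names `handle`, `fact_server` and `factorial` are hypothetical. It says nothing about typing; it merely shows that the reply on r for argument n arrives only after n nested request/reply exchanges, which is why the time bound on r depends on n.

```python
import queue
import threading


def handle(fact: queue.Queue, n: int, r: queue.Queue) -> None:
    """One copy of the server body, run for a single request [n, r]."""
    if n == 0:
        r.put(1)                      # r![1]
    else:
        r2 = queue.Queue()            # (nu r'): a fresh reply channel
        fact.put((n - 1, r2))         # fact![n - 1, r']
        m = r2.get()                  # r'?[m]
        r.put(m * n)                  # r![m x n]


def fact_server(fact: queue.Queue) -> None:
    """*fact?[n, r]: replication is modelled by one thread per request."""
    while True:
        n, r = fact.get()
        threading.Thread(target=handle, args=(fact, n, r), daemon=True).start()


def factorial(n: int) -> int:
    """Client: (nu y)(fact![n, y] | y?[x]) with a timeout as a safety net."""
    fact = queue.Queue()
    threading.Thread(target=fact_server, args=(fact,), daemon=True).start()
    y = queue.Queue()
    fact.put((n, y))
    return y.get(timeout=5)
```

Calling `factorial(3)` sends one request on `fact` and blocks on the private reply channel `y` until the three nested requests have been answered in turn.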

7 Related Work 

Our Previous Type Systems for Deadlock-Freedom. As mentioned earlier, the type system in this paper originates from our previous type systems for deadlock-freedom [7, 23, 11]. We expect that we can reconstruct a type system for deadlock-freedom by changing the definition of the order between time tags (for example, by identifying $t_1 + t_2$ with $t_2$ if $t_1 \le t_2$). Nice points about the new type system are that the intuitive meaning of a channel type is clearer, and that we can get rid of complex side conditions on time tags (which were introduced to get enough expressive power) in the typing rules of the previous type systems [7, 23]. We expect that we can recover much of the expressive power by using more standard concepts like dependent types and polymorphism as described in Section 6.
Other Type Systems for Analyzing Similar Properties of Concurrent Processes. There exist a few other type systems for π-calculus-like languages (where channels are first-class data) that guarantee a livelock-freedom property. As far as we know, however, they deal with more specific situations: Sangiorgi's type system for receptiveness [22] enforces that an input process is spawned immediately after a certain channel (called a receptive name) is created, and therefore guarantees that every output on that channel succeeds immediately. Puntigam and Peter's typed concurrent object calculus [20] guarantees that certain reply messages (which they call promised messages) eventually arrive.

There are a few other type systems that deal with deadlock-freedom ITfroi 
12 fi] . Please refer to our previous paper m for comparisons with them. 

Abadi and Flanagan recently proposed a typed concurrent object calculus 
that can prevent race conditions. Our type system can also be used to prevent 
race conditions. Notice that a race condition occurs only when more than one 
processes try to input or output on the same channel simultaneously. Such a 
situation can be detected by looking at the usage of each channel: If the usage of 
a channel can be reduced to a usage of the form || .U 2 || ■ ■ more than 

one processes may try to perform an input on the channel at the same time. 
Abadi and Flanagan’s race detection is, however, more sophisticated because 
dependencies between different channels (or locks) are taken into account. We 
might be able to extend our type system to subsume it by extending a channel 
usage so that it expresses the use of a group of channels, instead of that of each 
channel (see discussions on an extension to a deadlock-free concurrent object 
calculus in nn). 

Type Systems for Bounding Execution Time of Sequential Programs. There are several pieces of work that try to statically bound the running time of sequential programs [3, 6]. Most closely related (especially to the extension sketched in Section 6) seems to be Crary and Weirich's work [3], which also uses dependent
types. A difficulty in bounding running-time of a concurrent process is that unlike 
sequential programs where a function/procedure call is never blocked, a process 
may be blocked until a communication partner becomes ready. We have dealt 
with this difficulty in this paper by associating each input/output operation 
with two time bounds: a time bound within which the operation is executed, 
and another time bound within which a communication partner becomes ready. 

Abstract Interpretation. An alternative way to analyze the behavior of a concurrent program would be to use abstract interpretation [2]. Actually, from a very general viewpoint, our type-based analysis of deadlock and livelock can be seen as a kind of abstract interpretation. We can read a type judgment $\Gamma \vdash P$ as "$\Gamma$ is an abstraction of a concrete process P." (The relation "$\vdash$" corresponds to a pair of abstraction/concretization functions.) Indeed, we can regard a type environment as an abstract process: we have defined reductions of type environments in Section 4.8.

The subject reduction property (Theorem 39) can be interpreted as "whenever a concrete process P is reduced to another concrete process Q, an abstraction $\Gamma$ of P can also be reduced to another abstract process $\Delta$ which is an abstraction of Q." In other words, every reduction step of a concrete process is simulated by reduction of its abstract process. So, it corresponds to a consistency condition of abstract interpretation. The reliability condition (Definition 30) guarantees that an abstract process never falls into a livelock. Thus, a concrete process is also guaranteed to be livelock-free.

8 Conclusion 

In this paper, we have extended our previous type systems for deadlock-freedom 
to guarantee livelock- freedom and time-boundedness properties. A number of 
practical issues remain to be solved in applying these type systems to real pro- 
gramming languages, such as how to combine dependent types, polymorphism, 
etc. to obtain a reasonable expressive power and how and to what extent to let 
programmers declare type information. 

A key idea common to those type systems is to decompose the behavior of 
a whole process into that on each communication channel, which is specified 
by using a mini-process calculus of usages. This idea would be applicable to other analyses, such as race detection (as mentioned in Section 7) and memory management. As for an application to memory management, we have already applied a similar idea to analyze how and in which order each heap cell (instead of a communication channel) is used in functional programs [8].

Acknowledgment 

We would like to thank Atsushi Igarashi, Eijiro Sumii, and Nobuko Yoshida for 
useful comments. 



References 

1. G. Boudol. Typing the use of resources in a concurrent calculus. In Proceedings of 
ASIAN’97, pages 239-253, 1997. 

2. P. Cousot and R. Cousot. Abstract interpretation: A unified lattice model for 
static analysis of programs by construction or approximation of fixpoints. In Pro- 
ceedings of ACM SIGPLAN/SIGACT Symposium on Principles of Programming 
Languages, pages 238-252, 1977. 

3. K. Crary and S. Weirich. Resource bound certification. In Proceedings of ACM 
SIGPLAN/SIGACT Symposium on Principles of Programming Languages, pages 
184-198, 2000. 

4. C. Flanagan and M. Abadi. Object types against races. In CONCUR'99, LNCS 
1664, pages 288-303. Springer-Verlag, 1999. 

5. S. J. Gay. A sort inference algorithm for the polyadic π-calculus. In Proceedings of 
ACM SIGPLAN/SIGACT Symposium on Principles of Programming Languages, 
pages 429-438, 1993. 

6. M. Hofmann. Linear types and non-size-increasing polynomial time computation. 
In Proceedings of IEEE Symposium on Logic in Computer Science, pages 464-473, 
1999. 



Type Systems for Concurrent Processes 389 



7. N. Kobayashi. A partially deadlock-free typed process calculus. ACM Transac- 
tions on Programming Languages and Systems, 20(2):436-482, 1998. A preliminary 
summary appeared in Proceedings of LICS’97, pages 128-139. 

8. N. Kobayashi. Quasi-linear types. In Proceedings of ACM SIGPLAN/SIGACT 
Symposium on Principles of Programming Languages, pages 29-42, 1999. 

9. N. Kobayashi. A livelock-free typed process calculus. Technical report, Dept. 
Info. Sci., Univ. of Tokyo, 2000. To appear. Available at 
http://www.yl.is.s.u-tokyo.ac.jp/~koba/publications.html. 

10. N. Kobayashi, B. C. Pierce, and D. N. Turner. Linearity and the pi-calculus. 
ACM Transactions on Programming Languages and Systems, 21(5):914-947, 1999. 
Preliminary summary appeared in Proceedings of POPL'96, pp. 358-371. 

11. N. Kobayashi, S. Saito, and E. Sumii. An implicitly-typed deadlock-free process 
calculus. Technical Report TR00-01, Dept. Info. Sci., Univ. of Tokyo, January 
2000. Available at http://www.yl.is.s.u-tokyo.ac.jp/~koba/publications.html. A 
summary will appear in Proceedings of CONCUR 2000, LNCS, Springer-Verlag. 

12. N. Kobayashi and A. Yonezawa. Towards foundations for concurrent object- 
oriented programming - types and language design. Theory and Practice of Object 
Systems, 1(4):243-268, 1995. 

13. R. Milner. The polyadic π-calculus: a tutorial. In F. L. Bauer, W. Brauer, and 
H. Schwichtenberg, editors, Logic and Algebra of Specification. Springer-Verlag, 
1993. 

14. R. Milner, J. Parrow, and D. Walker. A calculus of mobile processes, I, II. Infor- 
mation and Computation, 100:1-77, September 1992. 

15. U. Nestmann. What is a ‘good’ encoding of guarded choice? In EXPRESS’97, 
volume 7 of ENTCS. Elsevier Science Publishers, September 1997. 

16. S. Peyton Jones. Concurrent Haskell. In Proceedings of ACM SIGPLAN/SIGACT 
Symposium on Principles of Programming Languages, pages 295-308, 1996. 

17. B. Pierce and D. Sangiorgi. Typing and subtyping for mobile processes. Mathe- 
matical Structures in Computer Science, 6(5):409-454, 1996. 

18. B. C. Pierce and D. N. Turner. Concurrent objects in a process calculus. In Theory 
and Practice of Parallel Programming (TPPP), Sendai, Japan (Nov. 1994), LNCS 
907, pages 187-215. Springer-Verlag, 1995. 

19. F. Puntigam. Coordination requirements expressed in types for active objects. In 
Proceedings of ECOOP’97, LNCS 1241, pages 367-388, 1997. 

20. F. Puntigam and C. Peter. Changeable interfaces and promised messages for con- 
current components. In Proceedings of the 1999 ACM Symposium on Applied Com- 
puting, pages 141-145, 1999. 

21. J. H. Reppy. CML: A higher-order concurrent language. In Proceedings of the ACM 
SIGPLAN’91 Conference on Programming Language Design and Implementation, 
pages 293-305, 1991. 

22. D. Sangiorgi. The name discipline of uniform receptiveness (extended abstract). 
In Proceedings of ICALP’97, LNCS 1256, pages 303-313, 1997. 

23. E. Sumii and N. Kobayashi. A generalized deadlock-free process calculus. In 
Proc. of Workshop on High-Level Concurrent Language (HLCL’98), volume 16(3) 
of ENTCS, pages 55-77, 1998. 

24. V. T. Vasconcelos and K. Honda. Principal typing schemes in a polyadic π-calculus. 
In CONCUR'93, LNCS 715, pages 524-538. Springer-Verlag, 1993. 

25. A. Yonezawa and M. Tokoro. Object-Oriented Concurrent Programming. The MIT 
Press, 1987. 

26. N. Yoshida. Graph types for monadic mobile processes. In FST/TCS'16, LNCS 
1180, pages 371-387. Springer-Verlag, 1996. 




Local π-Calculus at Work: 
Mobile Objects as Mobile Processes* 



Massimo Merro^**, Josva Kleist^, and Uwe Nestmann^ 



^ INRIA, Sophia-Antipolis, France 
^ BRICS - Basic Research in Computer Science, 
Danish National Research Foundation, 
Aalborg University, Denmark 



Abstract. Obliq is a distributed, object-based programming language. 
In Obliq, the migration of an object is proposed as creating a clone of 
the object at the target site, whereafter the original object is turned into 
an alias for the clone. Obliq has only an informal semantics, so there is 
no proof that this style of migration is correct, i.e., transparent to object 
clients. In this paper, we focus on Øjeblik, an abstraction of Obliq. We 
give a π-calculus semantics to Øjeblik, and we use it to formally prove 
the correctness of object surrogation, an abstraction of object migration. 



1 Introduction 

The work presented in this paper is in line with the research activity to use 
the π-calculus as a toolbox for reasoning about distributed object-oriented 
programming languages. Former works on the semantics of objects as processes 
have shown the value of this approach: while have focused on just 

providing formal semantics to object-oriented languages and language features, 
the work of others has been driven by a specific programming problem. 

Our work tackles a problem in Obliq, Cardelli’s lexically-scoped distributed pro- 
gramming language 0. In this setting, Cardelli proposed to implement object 
migration by creating a clone of the object at the target site and then turning the 
original (local) object into an alias for the new (remote) object. The question 
arises, whether this style of object migration is correct, and how that can be 
stated formally. However, Obliq is not equipped with a formal semantics, apart 
from an unpublished note by Talcott EH, which provides a configuration-style 
semantics for a subset of Obliq excluding migration. The aim of our project uni 
is to remedy this lack of formality and to reason formally about migration. 

Previous work. Since Obliq is lexically scoped, we may ignore the aspects of
distribution, at least when regarding the results of Obliq computations, unless
distribution sites fail. Following this idea, Øjeblik, which we introduce in
Section 3, is an object-based language that represents Obliq’s concurrent core,
but it can also be seen as a concurrent extension of the Imperative Object
Calculus. Øjeblik supports surrogation, a distribution-free abstraction of migration.

In [16], we gave a formal definition of correctness for object surrogation in
Øjeblik which can be straightforwardly extended to object migration in Obliq.
The intuition is that, in order to be correct, the surrogation (resp. migration)
of an object must be transparent to the clients of that object, i.e., the object
must behave the same before and after surrogation (resp. migration). We have
formalized this concept as the simple equation a.ping = a.surrogate, where the
left side represents the object a before surrogation (a.ping returns a if reachable),
the right side represents the object a after surrogation, and = is an appropriate
contextual equivalence, based on the possibility of convergence.

In [16], we have also given several proposals of configuration-style semantics
for Øjeblik. One of them fits the Obliq implementation [3,4], but does not
guarantee the correctness of object surrogation as defined above. This has been
formally shown by exhibiting Øjeblik contexts that are able to distinguish the
terms a.ping and a.surrogate. Similar counterexamples apply to object
surrogation in Obliq, as we have tested using the Obliq interpreter. In order to
achieve the correctness of surrogation, we have proposed an improved semantics
in [16], but that work did not contain a proof.



Contribution of the paper. In this paper, we present a π-calculus semantics
for Øjeblik corresponding to the aforementioned variant proposed in [16]. We
also give a notion of contextual equivalence for objects, defined in terms of may
convergence on π-processes, corresponding to the equivalence =. More precisely,
our semantics uses Local π, in short Lπ, a variant of the asynchronous
π-calculus where the recipients of a channel are local to the process that has
created the channel. We prove the correctness of surrogation in two parts.

The algebraic part, our first theorem, relates, with respect to arbitrary π-calculus
contexts, the core component of the translation of a single object in its ping’ed
and surrogate’d version, both after commitment to the respective request and under
the condition that the request did not arise internally from the object itself. Here,
we use powerful adaptations of proof techniques, partially known from the standard
π-calculus and Lπ, also to show, in a separate lemma, that the alias component of
the surrogate’d version behaves like a forwarder for external requests. Due to the
unavoidable complexity of the language and its semantics, the proof of this theorem
is non-trivial, but it provides the sought-after insight that we gave the proper
π-calculus semantics to aliased objects, which actually drove the development of
the proper corresponding operational semantics of aliased objects in [16].

The iterative part, our second theorem (as conjectured in [16]), relates the
may-convergence behavior of the terms a.ping and a.surrogate within
Øjeblik-contexts. Here, we constructively simulate arbitrarily long converging
sequences “up to” the first theorem, so the Øjeblik-contexts must guarantee that
the requests will be external. The main difficulty of the second theorem is that
inherently concurrent Øjeblik-contexts may non-deterministically prevent either
term from eventually committing to the externally requested operation; this also
rules out both the must-variant of convergence equivalence and bisimulation
equivalences.

Summing up, we give (to our knowledge) the first formal proof that migration 
can be correctly implemented in terms of cloning and aliasing. Due to lack of 
space, proofs are sketched or omitted; complete proofs are found in the full paper. 



2 Local π: An “Object-Oriented” π-Calculus

Local π, in short Lπ, is a variant of the asynchronous π-calculus [7] where,
similar to the Join-calculus, the recipients of a channel are local to the process
that has created the channel. This is achieved by imposing the syntactic
constraint that only the output capability of names may be transmitted, i.e., the
recipient of a name may only use it in output actions. This property makes Lπ
particularly suitable for giving the semantics to, and reasoning about, concurrent
object-oriented languages. In particular, we can easily guarantee the uniqueness
of object identities, a fundamental feature of objects: in object-oriented
languages, the name of an object may be transmitted; the recipient may use that
name to access the methods of the object, but it cannot create a new object with
the same name. When representing objects in the π-calculus, this translates directly
into the constraint that the process receiving an object name may only use it in
output actions, which is guaranteed in our setting.
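As an informal illustration of this locality discipline (it is not part of the formal development), the following Python sketch models a channel whose creator keeps the input end, while only a send-only handle is ever handed to other parties. The class names `Channel` and `OutputCapability` are hypothetical, and Python cannot enforce the restriction as strictly as the syntactic constraint of Lπ does.

```python
from collections import deque


class Channel:
    """A channel whose input end stays with its creator (hypothetical model)."""

    def __init__(self):
        self._buffer = deque()  # asynchronous: sending never blocks

    def output_capability(self):
        # Only this restricted handle is meant to be transmitted to others.
        return OutputCapability(self)

    def receive(self):
        # Input is local: only code holding the Channel itself can call this.
        return self._buffer.popleft()


class OutputCapability:
    """Send-only view of a channel; it offers no method for receiving."""

    __slots__ = ("_send",)

    def __init__(self, channel):
        self._send = channel._buffer.append

    def send(self, value):
        self._send(value)


# The creator keeps `obj` (and thus the input end); a client only gets `ref`.
obj = Channel()
ref = obj.output_capability()
ref.send(("inv", "hello"))
received = obj.receive()
can_receive = hasattr(ref, "receive")
```

In this reading, `obj` plays the role of an object that owns its name, and `ref` is what a client obtains when the object's name is transmitted: the client can send requests, but cannot set up a competing receiver under the same name.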



2.1 Terms and Types 

In Table 1, we consider a typed version of polyadic Lπ extended with: (i) labeled
values $\ell\_v$, called variants, with case analysis; (ii) tuple values
$\langle v_1..v_n\rangle$, with pattern matching; (iii) constants $k$, called keys,
with matching.

To deal with these rather complex values, we introduce several syntactic
categories. As additional metavariables, we let $s, p, q, r, m, t$ range over
channels, $y$ over variables, $w$ over values, $Q$ over processes, and
$i, j, d, h, m$ over tuple, variant, or other indices. We abbreviate
$\ell\_\langle\rangle$ and $\ell\_()$ with $\ell$, as well as
$\overline{q}\langle\rangle$ and $q().P$ with $\overline{q}$ and $q.P$,
respectively, while $\tilde v$ denotes a sequence $v_1..v_m$.

Restriction, both inputs, and both destructors are binders for the names
$x, x_1, \ldots, x_m$ in the respective scopes $P, P_1, \ldots, P_m$. We assume the
usual definitions of free and bound occurrences of names, based on these binders;
the inductively defined functions fn(P) and bn(P) denote those of process P.
Similarly, fc(P) and bc(P) denote the free and bound channels of process P.
Moreover, $n(P) = fn(P) \cup bn(P)$ and $c(P) = fc(P) \cup bc(P)$. Substitutions,
ranged over by $\sigma$, are type-preserving functions from names to values (types
are introduced below). For an expression $e$, $e\sigma$ is the result of applying
$\sigma$ to $e$, with the usual renaming to avoid captures. We write $e\{v/x\}$ for
a substitution that operates only on name $x$, leaving all the other names
unchanged. Relabelings, ranged over by $\rho$, are functions from labels to labels.
We write $P[\ell'/\ell]$ to replace all occurrences of label $\ell$ by label
$\ell'$ in values occurring in term P. Substitution and relabeling have the highest
operator precedence, parallel composition the lowest.

Table 1. π-Calculus

  Channels:   $c \in \mathbf{C}$                       Auxiliary:  $u \in \mathbf{U}$
  Keys:       $k \in \mathbf{K}$                       Variables:  $x \in \mathbf{X}$,  $x ::= n \mid u$
  Names:      $n \in \mathbf{N}$,  $n ::= c \mid k$    Labels:     $\ell, \ell_1, \ell_2, \ldots \in \mathbf{L}$

  Values
    $v ::= x$                                   variable
    $\;\mid\; \ell\_v$                          variant
    $\;\mid\; \langle v_1..v_n\rangle$          tuple

  Types
    $T ::= \mathsf{C}(T)$                                 channel type
    $\;\mid\; \mathsf{K}$                                 key type
    $\;\mid\; [\ell_1{:}T_1; \ldots; \ell_m{:}T_m]$       variant type
    $\;\mid\; \langle T_1..T_m\rangle$                    tuple type
    $\;\mid\; X$                                          type variable
    $\;\mid\; \mu X.T$                                    recursive type

  Processes
    $P ::= 0$                                                                          nil process
    $\;\mid\; c(x).P$                                                                  single input
    $\;\mid\; \overline{c}v$                                                           output
    $\;\mid\; P_1 \mid P_2$                                                            parallel
    $\;\mid\; (\nu n{:}T)\,P$                                                          restriction
    $\;\mid\; !c(x).P$                                                                 replicated input
    $\;\mid\;$ if $[k{=}k_1]$ then $P_1$ elif $[k{=}k_2]$ then $P_2$ else $P_3$        key testing
    $\;\mid\;$ case $v$ of $\ell_1\_(x_1){:}P_1; \ldots; \ell_m\_(x_m){:}P_m$          variant destructor
    $\;\mid\;$ let $\langle x_1..x_m\rangle = v$ in $P$                                tuple destructor

  The locality constraint requires that in (single and replicated) inputs
  and (variant and tuple) destructors the bound names $x, x_1, \ldots, x_m$ must not
  be used in free input position within the respective scope $P, P_1, \ldots, P_m$.
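The locality constraint stated with Table 1 is a purely syntactic check. As a minimal sketch, assuming a cut-down process syntax with only nil, output, (replicated) input, parallel composition and restriction (variants, tuples and keys are omitted, and all class and function names are hypothetical), such a check could look as follows in Python.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Nil:
    pass


@dataclass(frozen=True)
class Output:            # output of names on channel `chan`
    chan: str
    args: Tuple[str, ...]


@dataclass(frozen=True)
class Input:             # c(x1..xn).P, or !c(x1..xn).P if replicated
    chan: str
    params: Tuple[str, ...]
    body: "Proc"
    replicated: bool = False


@dataclass(frozen=True)
class Par:               # P1 | P2
    left: "Proc"
    right: "Proc"


@dataclass(frozen=True)
class New:               # restriction (nu n) P
    name: str
    body: "Proc"


Proc = Union[Nil, Output, Input, Par, New]


def is_local(proc, received=frozenset()):
    """Return True if no name bound by an input is used as an input subject."""
    if isinstance(proc, (Nil, Output)):
        return True
    if isinstance(proc, Par):
        return is_local(proc.left, received) and is_local(proc.right, received)
    if isinstance(proc, New):
        # A freshly restricted name shadows a received name of the same spelling.
        return is_local(proc.body, received - {proc.name})
    if isinstance(proc, Input):
        if proc.chan in received:
            return False  # a received name occurs in input position
        return is_local(proc.body, received | set(proc.params))
    raise TypeError(f"unknown process: {proc!r}")


# a(x). x<a> only outputs on the received name x: allowed.
ok = Input("a", ("x",), Output("x", ("a",)))
# a(x). x(y).0 listens on the received name x: rejected.
bad = Input("a", ("x",), Input("x", ("y",), Nil()))
# a(x). (nu x) x(y).0 listens on a fresh, locally created x: allowed.
shadowed = Input("a", ("x",), New("x", Input("x", ("y",), Nil())))
```

The function only tracks names bound by inputs; extending it to the destructors of Table 1 would mean adding their bound names to `received` in the same way.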

Types are introduced for essentially three reasons: (i) they allow us to cleanly
define some abbreviations, (ii) we use them to give a typed semantics of
Øjeblik, and (iii) they allow us to formally prove the main result of the paper using
typed behavioral equivalences. Abusing the notation for sets of names and the
corresponding types, we use K and C also as type constructors, where channel
types are parameterized over the type of value they carry. For variants and tuples
we use standard notations. In a recursive type $\mu X.T$, occurrences of
variable $X$ in type $T$ must be guarded, i.e., underneath a variant or channel
constructor. We often omit the type annotation of restriction, when it is clear
from the context or not important for the discussion. A type environment $\Gamma$ is
a finite mapping from variables to types. A typing judgment $\Gamma \vdash P$ asserts
that process $P$ is well-typed in $\Gamma$, and $\Gamma \vdash v{:}T$ that value $v$
has type $T$ in $\Gamma$. There is one typing rule for each process construct; each
of them is straightforward. (We provide the type system in the full version of the
paper.) A type environment $\Gamma$ is closed if it contains only names, no
auxiliary variables.

2.2 Semantics and Proof Techniques 

We equip our π-calculus with a standard reduction semantics. Its rules, defining
the reduction relation $\to$, are the standard ones, but based on a notion of
structural equivalence that is extended to deal with if-, case-, and let-constructs.
For simplicity, we only consider the semantics of well-typed processes.

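Definition 1. Structural equivalence, written $\equiv$, is the smallest relation
preserved by parallel composition and restriction, which satisfies the axioms below:

- $P \equiv Q$, if $P$ is $\alpha$-convertible to $Q$;

- $P \mid 0 \equiv P$, $P \mid Q \equiv Q \mid P$, $P \mid (Q \mid R) \equiv (P \mid Q) \mid R$;

- $(\nu a)\,0 \equiv 0$, $(\nu a)(\nu b)\,P \equiv (\nu b)(\nu a)\,P$;

- $(\nu a)(P \mid Q) \equiv P \mid (\nu a)\,Q$, if $a \notin fn(P)$;

- if $[k_1{=}k_1]$ then $P_1$ elif $[k{=}k_2]$ then $P_2$ else $P_3$ $\equiv P_1$;

- if $[k{=}k_1]$ then $P_1$ elif $[k_2{=}k_2]$ then $P_2$ else $P_3$ $\equiv P_2$, if $k_1 \neq k$;

- if $[k{=}k_1]$ then $P_1$ elif $[k{=}k_2]$ then $P_2$ else $P_3$ $\equiv P_3$, if $k_1 \neq k \neq k_2$;

- case $\ell_j\_v$ of $\ell_1\_(x_1){:}P_1; \ldots; \ell_j\_(x_j){:}P_j; \ldots; \ell_m\_(x_m){:}P_m$ $\equiv P_j\{v/x_j\}$;

- let $\langle x_1..x_m\rangle = \langle v_1..v_m\rangle$ in $P$ $\equiv P\{\tilde v/\tilde x\}$.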

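The relation $\Rightarrow$ is the reflexive-transitive closure of $\to$. For any
relation $\mathcal{R}$ on processes, $\to_{\mathcal{R}}$ denotes the composition
$\mathcal{R}\,{\to}\,\mathcal{R}$, and $\Rightarrow_{\mathcal{R}}$ the
reflexive-transitive closure of $\to_{\mathcal{R}}$.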

As regards behavioral equivalences, we focus on barbed bisimulation, a
uniform mechanism for defining behavioral equivalences in any process calculus
possessing (i) a reduction relation and (ii) an observability predicate. Barbed
bisimulation equips a global observer with a minimal ability to observe actions
and/or process states, but it is not a congruence. By closing barbed bisimulation
under contexts we obtain a much finer relation. A context $C[\cdot]$ is a process
with exactly one hole, written $[\cdot]$, where a process may be plugged in. A
context $C[\cdot]$ is static if it is structurally equivalent to
$(\nu\tilde d)(P \mid [\cdot])$, for some $P$ and $\tilde d$. We let $E[\cdot]$
range over static contexts. In an asynchronous scenario, only output actions are
usually considered observable [2]. A process $P$ has a barb at channel $c$,
written $P\downarrow_c$, if $P \equiv E[\overline{c}v]$ for some static context
$E[\cdot]$, value $v$, and channel $c \in fn(P)$. We write $P\Downarrow_c$ if there
exists a process $P'$ with $P \Rightarrow P'$ and $P'\downarrow_c$.
In a typed scenario, only well-typed contexts are usually considered. Therefore,
we recall the notion of $(\Delta/\Gamma)$-context [17,19]: when filled with a
process $P$ with $\Gamma \vdash P$, a $(\Delta/\Gamma)$-context $C[\cdot]$
guarantees the typing $\Delta \vdash C[P]$.

We often work with channels that have been extruded to the environment. 
To keep track of the fact that they cannot be used in input by the environment 
we generalize the standard typed barbed relations. 

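Definition 2 (Barbed Relations). Let $\mathcal{C} \subseteq \mathbf{C}$. Barbed
$\mathcal{C}$-bisimilarity, written $\dot\approx_{\mathcal{C}}$, is the largest
symmetric relation on processes, such that $P \mathrel{\dot\approx_{\mathcal{C}}} Q$
implies:

1. If $P \to P'$, then there exists $Q'$ such that $Q \Rightarrow Q'$ and
   $P' \mathrel{\dot\approx_{\mathcal{C}}} Q'$.
2. If $P\downarrow_c$ with $c \notin \mathcal{C}$, then $Q\Downarrow_c$.

Let $\Gamma$ be a typing, and $P$ and $Q$ two processes such that
$\Gamma \vdash P, Q$. We say that $P$ and $Q$ are barbed
$\Gamma;\mathcal{C}$-equivalent, written $P \simeq_{\Gamma;\mathcal{C}} Q$, if, for
each closed type environment $\Delta$ and static $(\Delta/\Gamma)$-context
$C[\cdot]$ not containing names in $\mathcal{C}$ in input position, we have
$C[P] \mathrel{\dot\approx_{\mathcal{C}}} C[Q]$. We say that $P$ and $Q$ are barbed
$\Gamma;\mathcal{C}$-congruent, written $P \cong_{\Gamma;\mathcal{C}} Q$, if, for
each closed type environment $\Delta$ and $(\Delta/\Gamma)$-context $C[\cdot]$ not
containing names in $\mathcal{C}$ in input position, we have
$C[P] \mathrel{\dot\approx_{\mathcal{C}}} C[Q]$.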

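If $\mathcal{C}=\emptyset$ in Definition 2, we omit $\mathcal{C}$ and get the
standard definitions of barbed bisimilarity $\dot\approx$, barbed
$\Gamma$-equivalence $\simeq_\Gamma$, and barbed $\Gamma$-congruence $\cong_\Gamma$,
respectively. If $\mathcal{C}=\{s\}$, we write $\simeq_{\Gamma;s}$ for
$\simeq_{\Gamma;\mathcal{C}}$ and $\cong_{\Gamma;s}$ for $\cong_{\Gamma;\mathcal{C}}$.
Due to the restrictions on the contexts, it holds that
$\overline{s}v \cong_{\Gamma;s} 0$ and, by asynchrony, $s(u).0 \cong_{\Gamma;s} 0$.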

The main inconvenience of barbed equivalences and congruences is that they
use quantifications over contexts in the definition, and this unnecessarily
complicates proofs of process equality. In the long version of this paper, we provide
and make use of labeled characterizations. Parts of our proofs are based on the
generalization (to our setting) of Lπ proof techniques that are based on special
processes called links. A link (sometimes called a forwarder) is a process
$!p(u).\overline{q}u$, abbreviated $p \rhd q$, that behaves as an unbounded
unordered buffer receiving values at one end ($p$) and retransmitting them at the
other end ($q$). The following lemma allows us to always output bound names instead
of free names. Its proof, in pure Lπ, is known from earlier work, and its
generalization to our typed setting is straightforward.

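Lemma 1. Let $\Gamma \vdash \overline{a}v$, for some $\Gamma$. Let $b \in fc(v)$
with $\Gamma \vdash b{:}\mathsf{C}(T)$. Let $d \notin c(v)$ and $w = v\{d/b\}$. Then
$\overline{a}v \cong_\Gamma (\nu d{:}\mathsf{C}(T))\,(\overline{a}w \mid d \rhd b)$.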
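The link used in Lemma 1 can be pictured operationally. The sketch below is an illustration only, under the assumption that a channel is modeled as a plain Python list of pending messages; the function names `link_step` and `run_link` are hypothetical. Each step moves an arbitrary pending value from one end to the other, which is why the buffer is unordered. Unlike the replicated process, the loop stops once nothing is pending.

```python
import random


def link_step(p_pending, q_pending, rng):
    """One forwarding step of a link from p to q: move any pending value."""
    if not p_pending:
        return False
    index = rng.randrange(len(p_pending))      # unordered: any message may go first
    q_pending.append(p_pending.pop(index))
    return True


def run_link(p_pending, q_pending, seed=0):
    """Forward until p has no pending messages left."""
    rng = random.Random(seed)
    while link_step(p_pending, q_pending, rng):
        pass


p = ["a", "b", "c"]   # messages sent on p
q = []                # messages observable on q
run_link(p, q)
```

After the run, every message originally sent on `p` is available on `q`, possibly in a different order, which is the sense in which sending the fresh name `d` together with a link from `d` to `b` can stand in for sending `b` itself.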

3 Øjeblik: A Concurrent Object Calculus

The set $\mathcal{L}$ of untyped Øjeblik expressions is generated by the grammar
shown in Table 2, where $l$ ranges over method labels. In this section, we present
Øjeblik’s call-by-value semantics, first informally, then formally using the
π-calculus of §2, through which a standard behavioral semantics is also defined.

Objects. An object $[l_j{=}m_j]_{j\in J}$ consists of a finite collection of
updatable named methods $l_j{=}m_j$, for pairwise distinct labels $l_j$. In a method
$\varsigma(s,\tilde x)b$, the letter $\varsigma$ denotes a binder for the self
variable $s$ and argument variables $\tilde x$ within the body $b$. Moreover, every
object in Øjeblik comes equipped with special methods for cloning, aliasing,
surrogation, and pinging, which cannot be updated.

Method invocation $a.l\langle a_1..a_n\rangle$ with field $l$ of the object $a$
containing the method $\varsigma(s,\tilde x)b$ results in the body $b$ with the self
variable $s$ replaced by the enclosing object $a$, and the formal parameters
$\tilde x$ replaced by the actual parameters $a_1..a_n$ of the invocation. Method
update $a.l{\Leftarrow}m$ overwrites the current content of the named field $l$ in
$a$ with method $m$ and evaluates to the modified object. The operation a.clone
creates an object with the same fields as the original object and initializes the
fields to the same entries as in the original object.

  $a, b ::= O$                                   object
  $\;\mid\; a.l\langle a_1..a_n\rangle$          method invocation
  $\;\mid\; a.l{\Leftarrow}m$                    method update
  $\;\mid\; a.\mathsf{clone}$                    shallow copy
  $\;\mid\; a.\mathsf{alias}\langle b\rangle$    object aliasing
  $\;\mid\; a.\mathsf{surrogate}$                object surrogation
  $\;\mid\; a.\mathsf{ping}$                     object ping
  $\;\mid\; s, x, y, z$                          variables
  $\;\mid\;$ let $x = a$ in $b$                  local definition
  $\;\mid\; \mathsf{fork}\langle a\rangle$       thread creation
  $\;\mid\; \mathsf{join}\langle a\rangle$       thread destruction

  $O ::= [l_j{=}m_j]_{j\in J}$                   object record
  $m_j ::= \varsigma(s_j,\tilde x_j)b_j$         method

Table 2. Øjeblik Syntax



The operation $a.\mathsf{alias}\langle b\rangle$ replaces the object $a$ with an
alias for $b$, regardless of whether $a$ is already an alias or still an object
record; if $b$ itself is an alias to, e.g., an object $c$, then we consequently
create, by transitivity, an alias chain.

The operation a.surrogate turns object $a$ into a local proxy for a remote copy
of itself, as implemented by the uniform method
surrogate $= \varsigma(s)\,s.\mathsf{alias}\langle s.\mathsf{clone}\rangle$.
The operation a.ping is implemented by another uniform method
ping $= \varsigma(s)\,s$, such that it returns the “current identity” of the object
$o$ that results from the evaluation of $a$, i.e., due to possible forwarding, the
current endpoint of an alias chain potentially starting at object $o$.
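The interplay of clone, alias, ping and surrogate can be tried out on a toy model. The following Python sketch is a sequential illustration only: the class `Obj` and its attributes are hypothetical, it ignores threads, serialization and protection, and it assumes that an alias forwards ping and surrogate to its target and that cloning an alias yields another alias to the same target.

```python
class Obj:
    """Toy model of an object that is either a record or an alias."""

    def __init__(self, methods):
        self.methods = dict(methods)  # label -> method body (any Python value)
        self.target = None            # set once the object has become an alias

    def is_alias(self):
        return self.target is not None

    def clone(self):
        # Shallow copy: same fields, initialized to the same entries.
        copy = Obj(self.methods)
        copy.target = self.target     # assumption: an alias clones to an alias
        return copy

    def alias(self, other):
        # Turn this object into an alias for `other`, whatever it was before.
        self.target = other
        return other

    def ping(self):
        # Return the current endpoint of the alias chain starting here.
        if self.is_alias():
            return self.target.ping()
        return self

    def surrogate(self):
        # Uniform method: alias the object to a fresh clone of itself.
        if self.is_alias():
            return self.target.surrogate()
        return self.alias(self.clone())


a = Obj({"l": "body"})
before = a.ping()      # no alias yet: the object itself
b = a.surrogate()      # a becomes an alias for its clone b
after = a.ping()       # now the endpoint of the chain a -> b
c = a.surrogate()      # forwarded to b, giving the chain a -> b -> c
```

Here `before` is `a`, while `after` is the clone `b`, which carries the same method table. The model loops forever on a cyclic alias chain such as `a.alias(a)`; nothing in the sketch guards against that.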

Scoping. An expression let $x = a$ in $b$ (non-recursive) first evaluates $a$,
binding the result to $x$, and then evaluates $b$ within the scope of $x$.

Concurrency. Computational activity takes place within so-called threads. In 
addition to the main thread, new threads can be created by the fork command. 
The result of a forked computation is grabbed by the join command. 

Self-Infliction, Serialization, Protection. The current self of a thread is the
self of the last method invoked in the thread that has not yet completed. An
Øjeblik operation is self-inflicted (also: internal) if it operates on the current
self; otherwise, it is external. Øjeblik objects are serialized: within any object,
at any time, at most one thread may be active. Serialization is implemented
by associating with each object a mutex that is locked when a thread enters
the object and released when the thread exits the object. More precisely, the
mutex is always acquired for external operations, but never for internal ones;
this concept allows a method to recursively call its siblings through self, but it
excludes mutual recursion between objects. Øjeblik objects are protected: the
critical operations update, aliasing, and cloning, are only allowed if they are
internal. An operation is called valid if it is internal or not critical.
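The three notions of this paragraph reduce to simple predicates. As a small sketch, with operation names written as plain strings and objects compared by identity (both choices, like the function names, are assumptions made for illustration):

```python
# The critical operations named in the text.
CRITICAL = frozenset({"update", "alias", "clone"})


def is_internal(current_self, target):
    """An operation is self-inflicted if it operates on the current self."""
    return current_self is target


def needs_mutex(current_self, target):
    """The mutex is acquired for external operations, never for internal ones."""
    return not is_internal(current_self, target)


def is_valid(operation, current_self, target):
    """An operation is valid if it is internal or not critical."""
    return is_internal(current_self, target) or operation not in CRITICAL


a, b = object(), object()
```

For example, a thread whose current self is `a` may clone `a`, but a clone request aimed at `b` is invalid, whereas an invocation on `b` is valid and has to acquire the mutex of `b`.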

Table 3. Øjeblik Semantics: Clients, Scoping, Concurrency



3.1 Translational Semantics 

In addition to the core π-calculus of Section 2, we use parameterized recursive
definitions, which can be faithfully represented in terms of replication.

The semantics as presented in Tables 3 and 4 maps Øjeblik-terms into
π-calculus terms parameterized on two names: in a term $[\![a]\!]_p^k$, the channel
$p$ is used to return the term’s result, while the key $k$ represents the term’s
current self, which is required to deal with self-infliction. The essence of the
semantics is to set up processes representing objects that serve clients’ requests.
Different requests for operating on objects are distinguished by corresponding
π-calculus labels. We explain the semantics by showing how requests are generated
by clients, and then how they are served by objects. We omit explanations for the
semantics of scoping and concurrency; they can be found in the full paper.

Clients. In Table 3, the current self $k$ of encoded terms is ‘used’ as the current
self of the evaluation of the first subterm in left-to-right evaluation order. All
the translations in Table 3 follow a common scheme. For example, in the translation
of a method invocation $[\![a.l_j\langle a_1..a_n\rangle]\!]_p^k$, the subterms
$a, a_1..a_n$ have to be evaluated one after the other: the individual evaluations
use private return channels $q, q_1..q_n$, which are subsequently asked for the
respective results $y, x_1..x_n$, but also for the respective new current self
$i, i_1..i_n$ to be used by the next evaluation. This can be the same as for the
previous evaluation, but is not necessarily so (cf. the description of object
managers below). After the last subterm $a_n$ has returned its result, the
accumulated information is used to send a suitable request with label
$\mathsf{inv}_j$ to self-channel $y$ of object $a$, also carrying the overall
result channel $p$ and the latest current self $i_n$. Thus, the responsibility to
signal a result on $p$ is passed on to the respective object manager waiting at $y$.
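
$[\![O]\!]_p^k := (\nu s\,\tilde t)\,\big(\,\overline{p}\langle s,k\rangle \mid \mathsf{newO}_O\langle s,\tilde t\rangle \mid \prod_{j\in J} !t_j(s_j,\tilde x_j,r,k').[\![b_j]\!]_r^{k'}\,\big)$

$\mathsf{newO}_O\langle s,\tilde t\rangle := (\nu m_e m_i k_e k_i)\,(\,\overline{m_e} \mid \mathsf{OM}_O\langle s,m_e,m_i,k_e,k_i,\tilde t\rangle\,)$
$\mathsf{newA}_O\langle s,s_a\rangle := (\nu m_e m_i k_e k_i)\,(\,\overline{m_e} \mid \mathsf{AM}_O\langle s,m_e,m_i,k_e,k_i,s_a\rangle\,)$

$\mathsf{OM}_O\langle s,\tilde m,k_e,k_i,\tilde t\rangle := s(l,k).(\nu k^*)$
  if $[k{=}k_i]$ then
    case $l$ of
      $\mathsf{cln}\_(r)$ : $\mathsf{OM}_O\langle s,\tilde m,k_e,k^*,\tilde t\rangle \mid (\nu s^*)\,(\,\overline{r}\langle s^*,k^*\rangle \mid \mathsf{newO}_O\langle s^*,\tilde t\rangle\,)$ ;
      $\mathsf{ali}\_(s_a,r)$ : $\mathsf{AM}_O\langle s,\tilde m,k_e,k^*,s_a\rangle \mid \overline{r}\langle s_a,k^*\rangle$ ;
      $\mathsf{upd}_j\_(t',r)$ : $\mathsf{OM}_O\langle s,\tilde m,k_e,k^*,t_1..t_{j-1},t',t_{j+1}..t_n\rangle \mid \overline{r}\langle s,k^*\rangle$ ;
      $\mathsf{inv}_j\_(\tilde x,r)$ : $\mathsf{OM}_O\langle s,\tilde m,k_e,k^*,\tilde t\rangle \mid \overline{t_j}\langle s,\tilde x,r,k^*\rangle$ ;
      $\mathsf{sur}\_(r)$ : $\mathsf{OM}_O\langle s,\tilde m,k_e,k^*,\tilde t\rangle \mid [\![s.\mathsf{alias}\langle s.\mathsf{clone}\rangle]\!]_r^{k^*}$ ;
      $\mathsf{png}\_(r)$ : $\mathsf{OM}_O\langle s,\tilde m,k_e,k^*,\tilde t\rangle \mid [\![s]\!]_r^{k^*}$
  elif $[k{=}k_e]$ then
    $\mathsf{OM}_O\langle s,\tilde m,k_e,k^*,\tilde t\rangle \mid$ case $l$ of
      $\mathsf{cln}\_(r)$ : $m_i(k).\overline{m_e}$ ;
      $\mathsf{ali}\_(s_a,r)$ : $m_i(k).\overline{m_e}$ ;
      $\mathsf{upd}_j\_(t',r)$ : $m_i(k).\overline{m_e}$ ;
      $\mathsf{inv}_j\_(\tilde x,r)$ : $\mathsf{CM}[\,\overline{t_j}\langle s,\tilde x,r^*,k^*\rangle\,]$ ;
      $\mathsf{sur}\_(r)$ : $\mathsf{CM}[\,[\![s.\mathsf{alias}\langle s.\mathsf{clone}\rangle]\!]_{r^*}^{k^*}\,]$ ;
      $\mathsf{png}\_(r)$ : $\mathsf{CM}[\,[\![s]\!]_{r^*}^{k^*}\,]$
  else $\mathsf{OM}_O\langle s,\tilde m,k_e,k_i,\tilde t\rangle \mid m_e.(\,\overline{s}\langle l,k_e\rangle \mid \overline{m_i}\langle k\rangle\,)$

$\mathsf{CM}[\cdot] := (\nu r^*)\,(\,[\cdot] \mid r^*(y,k').m_i(k'').(\,\overline{r}\langle y,k''\rangle \mid \overline{m_e}\,)\,)$

$\mathsf{AM}_O\langle s,\tilde m,k_e,k_i,s_a\rangle := s(l,k).(\nu k^*)$
  if $[k{=}k_i]$ then
    case $l$ of
      $\mathsf{cln}\_(r)$ : $\mathsf{AM}_O\langle s,\tilde m,k_e,k^*,s_a\rangle \mid (\nu s^*)\,(\,\overline{r}\langle s^*,k^*\rangle \mid \mathsf{newA}_O\langle s^*,s_a\rangle\,)$ ;
      $\mathsf{ali}\_(s_a',r)$ : $\mathsf{AM}_O\langle s,\tilde m,k_e,k^*,s_a'\rangle \mid \overline{r}\langle s_a',k^*\rangle$ ;
      $\mathsf{upd}_j\_(t',r)$ : $\mathsf{AM}_O\langle s,\tilde m,k_e,k^*,s_a\rangle \mid \overline{s_a}\langle l,k\rangle$ ;
      $\mathsf{inv}_j\_(\tilde x,r)$ : $\mathsf{AM}_O\langle s,\tilde m,k_e,k^*,s_a\rangle \mid \overline{s_a}\langle l,k\rangle$ ;
      $\mathsf{sur}\_(r)$ : $\mathsf{AM}_O\langle s,\tilde m,k_e,k^*,s_a\rangle \mid \overline{s_a}\langle l,k\rangle$ ;
      $\mathsf{png}\_(r)$ : $\mathsf{AM}_O\langle s,\tilde m,k_e,k^*,s_a\rangle \mid \overline{s_a}\langle l,k\rangle$
  elif $[k{=}k_e]$ then $\mathsf{AM}_O\langle s,\tilde m,k_e,k^*,s_a\rangle \mid m_i(k).(\,\overline{s_a}\langle l,k\rangle \mid \overline{m_e}\,)$
  else $\mathsf{AM}_O\langle s,\tilde m,k_e,k_i,s_a\rangle \mid m_e.(\,\overline{s}\langle l,k_e\rangle \mid \overline{m_i}\langle k\rangle\,)$

Table 4. Øjeblik Semantics: Objects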

Objects. The semantics $[\![O]\!]_p^k$ of an object
$O := [l_j{=}\varsigma(s_j,\tilde x_j)b_j]_{j\in J}$, as shown in Table 4, consists
of a message that returns the object’s reference $s$ together with the current self
$k$ on channel $p$, a composition of replicated processes that give access to the
method bodies $[\![b_j]\!]$, and a new object process
$\mathsf{newO}_O\langle s,\tilde t\rangle$ that connects invocations on $s$ from the
outside to the method bodies, which are invoked by the trigger names $\tilde t$.
Inside $\mathsf{newO}_O\langle s,\tilde t\rangle$, several private names are needed:
mutexes $\tilde m := m_e, m_i$ are used for serialization; the (internal) key $k_i$
is used to detect self-infliction; the (external) key $k_e$ is used to implement
serialization in a concurrent environment (see later on).
The behavior of objects and aliases is represented by the object manager OM
and alias manager AM, respectively, which both, for each request arriving along
their reference $s$, first check for self-infliction $[k{=}k_i]$, and then, by simple
case analysis, for the kind of the request. We first explain how internal requests
are served in objects and aliases. External requests are treated afterwards.



Serving Internal Requests $[k{=}k_i]$. No serialization or protection is required.

Object Managers (OM). For each field, the manager may activate appropriate
instances of method bodies (case $\mathsf{inv}_j$: the method body bound to $l_j$
along trigger name $t_j$) and administer updates (case $\mathsf{upd}_j$: install a
new trigger name $t'$). Cloning (case cln) restarts the current object manager in
parallel with a new object, which uses the same method bodies $\tilde t$, but is
accessible through a fresh reference $s^*$. Aliasing (case ali) starts an
appropriate alias manager AM instead of re-starting the previous object manager OM.
Surrogation and ping (cases sur and png) are modeled according to their uniform
method definitions.

Alias Managers (AM). Local requests for cloning and aliasing are allowed
and behave analogously to the respective clauses in object managers, but
restarting AM instead of OM. Update, invocation, surrogation, and ping requests are
immediately forwarded as is to the aliasing target $s_a$.

Nonces ($k^*$). To guarantee the receptiveness of objects, managers OM and AM
always have to be restarted with some possibly changed state. However,
serialization requires that at any moment, only one request shall be active in an
object. According to our semantics, program contexts will never give rise to
several competing self-inflicted requests, but, when reasoning within arbitrary
π-calculus contexts, as we do later on, their existence must be taken into account.
Therefore, we add another layer of protection to increase the robustness of
serialization: each time a request enters a manager, a fresh key $k^*$ is created to
be used in the restarted manager; this key must subsequently be used as the current
self for all activities enabled by the current request. Thus, the consumption of
one of several competing pending requests renders its competitors external. Note
that nonces would not be necessary if we were interested in correct behaviors
within only translated contexts.



Serving External Requests $[k{\neq}k_i]$. Serialization and protection are required.

Fig. 1. Object Manager Serving External Requests

In order to clarify the behavior of an object manager serving an external
request, Figure 1 shows the five relevant “states”, starting from a free manager
that becomes active by some pending request grabbing its serialization lock.
Then, if the request is protection-critical it is discarded; otherwise the manager
commits to it and serves it until explicit termination. In both cases, the manager
becomes free again by releasing the lock. Note that internal requests can only
be served in “state” S of a manager that is currently already serving some request.
In Subsection 3.3, we explain Figure 1 in more detail; for now, it just offers an
informal device to guide the explanations of the semantics of object managers.
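The life cycle just described can be written down as a small transition table. The sketch below uses the state names F, A, D, S and T that appear in the text; the event names (`grab`, `critical`, `commit`, `terminate`, `release`) are hypothetical labels chosen for this illustration, and pre-processing and internal requests, which leave the state unchanged, are not modeled.

```python
# (state, event) -> next state, for one object manager serving external requests.
TRANSITIONS = {
    ("F", "grab"): "A",        # a pending request grabs the serialization lock
    ("A", "critical"): "D",    # protection-critical request: it is discarded
    ("A", "commit"): "S",      # non-critical request: the manager commits to it
    ("S", "terminate"): "T",   # the served request signals its result
    ("D", "release"): "F",     # the lock is released after discarding
    ("T", "release"): "F",     # the lock is released after termination
}


def step(state, event):
    """Return the successor state, or raise ValueError if the event is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed in state {state!r}") from None


def run(events, state="F"):
    """Apply a sequence of events, starting from a free manager by default."""
    for event in events:
        state = step(state, event)
    return state


served = run(["grab", "commit", "terminate", "release"])
discarded = run(["grab", "critical", "release"])
```

Both runs end in the free state again, matching the remark that the manager becomes free in both cases by releasing the lock.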

Serialization ($k_e$). As mentioned earlier, mutual exclusion within an
object is implemented by mutexes, so, upon creation of a new object newO, the
fresh mutex channel $m_e$ is initialized. According to serialization, the intended
continuation behavior of an incoming external request is blocked on $m_e$, once
it enters a manager. The manager itself is immediately restarted and remains
receptive; it also remains in its “state” according to Figure 1. Arbitrarily many
requests can be blocked this way and compete for the mutex $m_e$ once it becomes
available. A successfully unblocked request is resent to the same manager,
but now carrying another key $k_e$, which allows the manager to detect that the
request has grabbed the mutex, so the manager can evolve into “state” A. We
call pre-processing the procedure of intermediate blocking of requests and their
subsequent reemission with key $k_e$ instead. Alongside the pre-processed
request, its former current self $k$ is stored on the (internal) mutex $m_i$, for
recovery after termination. This recovery is actually necessary since the original
current self $k$ is possibly required for use later on by the sender of the request.

Nonces ($k^*$). Pre-processing must not reinitialize the key $k_i$ of the restarted
manager: a currently self-inflicted operation interleaved by pre-processing might
be prevented from proceeding, because it could unintentionally become external.

Object Managers (OM). Cloning, aliasing, and update are critical operations.
Once a respective pre-processed request is consumed, the manager evolves from
“state” A into “state” D; the request and its former current self $k$, stored on
channel $m_i$, will simply be ignored when releasing $m_e$ by consuming
$\overline{m_i}\langle k\rangle$.

Invocation, surrogation, and ping are non-critical operations. Once a respective
pre-processed request is consumed, the manager evolves from “state” A into
“state” S, implying that no other external request shall be served (apart from
pre-processing) until the current one has terminated. In order to be notified of
that event, we employ a call manager protocol, represented by the context
$\mathsf{CM}[\cdot]$:
instead of delegating to some other process the responsibility of returning a result
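
  $A, B ::= [l_j{:}\tilde B_j{\to}B_j]_{j\in J}$      object record type
  $\;\mid\; \mathsf{Thr}(A)$                          thread type

  $\mathsf{R}(X) := \mathsf{C}(X,\mathsf{K})$
  $\mathsf{M}(B_1..B_n{\to}B) := [\![B_1]\!]..[\![B_n]\!],\ \mathsf{R}([\![B]\!])$

  $[\![\,[l_j{:}\tilde B_j{\to}B_j]_{j\in J}\,]\!] := \mu A.\,\mathsf{R}(V)$, where $V$ is the variant type
      [ $\mathsf{cln}$ : $\mathsf{R}(A)$ ;
        $\mathsf{ali}$ : $\langle A, \mathsf{R}(A)\rangle$ ;
        $\mathsf{upd}_j$ : $\langle \mathsf{C}(\langle A, \mathsf{M}(\tilde B_j{\to}B_j), \mathsf{K}\rangle), \mathsf{R}(A)\rangle$ ;
        $\mathsf{inv}_j$ : $\langle \mathsf{M}(\tilde B_j{\to}B_j)\rangle$ ;
        $\mathsf{sur}$ : $\mathsf{R}(A)$ ;
        $\mathsf{png}$ : $\mathsf{R}(A)$ ]

  $[\![\mathsf{Thr}(A)]\!] := \mathsf{C}(\langle \mathsf{R}([\![A]\!]), \mathsf{K}\rangle)$

Table 5. Translation of Øjeblik-types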



on $r$, a fresh return channel $r^*$ is created to be used within $[\cdot]$ in place
of $r$, such that the result will first appear on $r^*$. Until this event, other
external requests remain blocked, while internal requests may well be served. After
this event, the manager evolves from “state” S into “state” T, where the former
current self can be grabbed from $m_i$, the result $y$ be forwarded to the intended
result channel $r$ (along with the former current self), and the mutex $m_e$ be
released. Note that externally triggered method bodies $[\![b_j]\!]$, and also
surrogation and ping bodies $[\![s.\mathsf{alias}\langle s.\mathsf{clone}\rangle]\!]$
and $[\![s]\!]$, are all run in the context of the nonce $k^*$, which is now
the internal key of the OM, so their further calls to $s$ will be self-inflicted.
This is essential for surrogation, since cloning and aliasing are only allowed
internally.

Alias Managers (AM). According to our discussion in [16], external requests
that arrive at an active alias manager should be blocked until the current activity
finishes and the lock $m_e$ is released. Once this happens, all external requests
are, after three intermediate steps via channels $m_e$, $s$, and $m_i$, forwarded to
the aliasing target $s_a$. The pre-processing of requests, presumably superfluous
in alias managers, is necessary there as well, because there may be pending
pre-processed requests that came into existence when $s$ was connected to an OM.



Type Translation. Øjeblik is equipped with a standard static type system.
The translation of terms into our π-calculus is accompanied by a straightforward
translation of the corresponding types in Table 5. Its importance is that
(i) the two translations together preserve typings (see the full paper for details),
and (ii) we can exploit the type information in proofs of properties of
Øjeblik-terms, not least by applying typed barbed relations.

3.2 Behavioral Semantics

A standard way to define program equivalence is to compare the behavior of
programs within arbitrary program contexts, as, for example, shown in previous
work on the Imperative Object Calculus (IOC). In our setting, an Øjeblik
context $C[\cdot]$ has a single hole $[\cdot]$ that may be filled with an Øjeblik
term. In the remainder of the paper, we assume that Øjeblik-contexts always yield
well-typed terms when plugging some Øjeblik-term into the hole. As a simple notion
of program behavior to be tested based on our Øjeblik-semantics, we choose
the existence of barbs as in §2. This closely follows the intuition that a term
$[\![a]\!]_p^k$ should tell its result on name $p$ as soon as it knows it. So, an
Øjeblik term converges if its semantics may report its result on the name $p$.

Definition 3 (Convergence). If $a \in \mathcal{L}$ is an Øjeblik term, then
$a\Downarrow$ if $[\![a]\!]_p^k \Downarrow_p$.



Definition 4 (Behavioral equivalence). Two Øjeblik terms $a, b \in \mathcal{L}$ are
behaviorally equivalent, written $a = b$, if $C[a]\Downarrow$ iff $C[b]\Downarrow$
for all contexts $C[\cdot]$.

Note that this equivalence is based on a may-variant of convergence. With
respect to our goal of reasoning about surrogation, must-variants of convergence
would be too strong, because, in a concurrent language with fork, threads may
nondeterministically affect the outcome and convergence of evaluation.



3.3 Properties of the Translational Semantics 

An important advantage of using a typed Lπ semantics is that it is easy to prove
that object managers are the unique receptors for their (bound) self-channels.

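Lemma 2 (Uniqueness of Objects). Let $a$ be an Øjeblik term. If
$[\![a]\!]_p^k \Rightarrow Z$ with
$Z \equiv (\nu\tilde z)\,(\,M \mid \mathsf{OM}_O\langle s,\ldots\rangle\,)$ or
$Z \equiv (\nu\tilde z)\,(\,M \mid \mathsf{AM}_O\langle s,\ldots\rangle\,)$, then
$s \in \tilde z$ and $s$ does not appear free in input position within $M$.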

We now analyze, referring to Figure 1, how the shape of a particular object
manager and its surrounding context evolves during computation. We will need
a particular case thereof, stated as a lemma, later on in the proof of our main
results.

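Observation 1: Pre-processing does not change the “state” of object managers.
At any time, an object/alias manager is ready to receive a request
$\overline{s}\langle l,k\rangle$ with $k_e \neq k \neq k_i$. The manager is
identically restarted afterwards, but will have spawned a process
$m_e.(\,\overline{s}\langle l,k_e\rangle \mid \overline{m_i}\langle k\rangle\,)$ that
replaces the consumed request. Let us assume that requests $\overline{s}v_j$ with
$v_j := \langle l_j,k_j\rangle$ for $1 \le j \le h$ (and $\tilde v := v_1..v_h$) are
pre-processed by the object manager
$\mathsf{OM}_O\langle s,m_e,m_i,k_e,k_i,\tilde t\rangle$, so $k_e \neq k_j \neq k_i$
for all $1 \le j \le h$. Then:

$\mathsf{PP}_O\langle s,m_e,m_i,k_e,\tilde v\rangle := \prod_{1\le j\le h} m_e.(\,\overline{s}\langle l_j,k_e\rangle \mid \overline{m_i}\langle k_j\rangle\,)$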






Observation 2: In object managers, $k_i$ may be extruded; $m_e, m_i, k_e$ may
not. Assume that an inv-request along $s$ appears at
$\mathrm{OM}_O\langle s,m_e,m_i,k_e,k_i,\tilde t\rangle$, is pre-processed,
gets the mutex $m_e$, and re-enters along $s$ with $k_e$. During this,
according to the semantics, a fresh internal key $k^*$ is created and extruded
to the corresponding method body. The names $\tilde n := m_e, m_i, k_e$ are
never extruded; they constitute the proper boundary of a manager during
computation. This observation also provides the formal basis for the figure of
manager states: a term $Z$ containing an object manager
$\mathrm{OM}_O\langle s,\tilde n,k_i,\tilde t\rangle$ corresponds to “state”
F, A, D, S, or T, respectively (for this manager), if, after moving as far as
possible all top-level parallel components of $Z$ outside the scope of
$(\nu\tilde n)$, the remaining component of $Z$ inside the scope has a
characteristic shape. In the full paper, we show the complete analysis; here,
we outline two special cases: free object managers in “state” F, and
committing object managers ready to evolve from “state” A into “state” S.

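Observation 3: An object manager is free if its external mutex is available.
In our semantics, a manager is willing to grant access if the external mutex
$m_e$ occurs unguarded in the term that describes the current “state”, so the
general shape of a free object (and analogously alias) manager is:

$$\begin{array}{rcl}
\mathrm{freeO}_O\langle s,k_i,\tilde t,\tilde v\rangle & \stackrel{\mathrm{def}}{=} & (\nu\tilde n)(\,\overline{m_e} \mid \mathrm{OM}_O\langle s,\tilde n,k_i,\tilde t\rangle \mid \mathrm{PP}_O\langle s,\tilde n,\tilde v\rangle\,) \\
\mathrm{freeA}_O\langle s,k_i,s_a,\tilde v\rangle & \stackrel{\mathrm{def}}{=} & (\nu\tilde n)(\,\overline{m_e} \mid \mathrm{AM}_O\langle s,\tilde n,k_i,s_a\rangle \mid \mathrm{PP}_O\langle s,\tilde n,\tilde v\rangle\,)
\end{array}$$

where the keys mentioned in $\tilde v$ of $\mathrm{PP}_O\langle\ldots\rangle$
match neither $k_e$ nor $k_i$. Note that
$\mathrm{newO}_O\langle s,\tilde t\rangle = (\nu k_i)\,\mathrm{freeO}_O\langle s,k_i,\tilde t,\emptyset\rangle$,
and analogously for $\mathrm{newA}_O\langle\ldots\rangle$.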

Observation 4: An object manager is ready to commit if it may consume a valid
pre-processed request. The ability to commit to a valid external request is
visible as the availability of a valid pre-processed request, i.e., a request
carrying $k_e$. From this ability, the following lemma derives the shape of
the object manager before and after commitment, including all of its current
pre-processed requests.

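Lemma 3 (Committing Object Manager). Let $a \in \mathcal{L}$ and
$[\![a]\!]_p \Rightarrow Z$. If
$Z \equiv E[\,\bar{s}\langle l,k_e\rangle \mid \mathrm{OM}_O\langle s,\tilde n,k_i,\tilde t\rangle\,]$
with $l \in \{\mathrm{inv}_j\_\langle\tilde a,r\rangle, \mathrm{png}\_r, \mathrm{sur}\_r\}$
and $\tilde n = m_e, m_i, k_e$, then $Z \to Z'$ for

$$\begin{array}{rcl}
Z & \equiv & E[\,(\nu\tilde n)(\,\overline{m_i}k' \mid \mathrm{PP}_O\langle s,\tilde n,\tilde v\rangle \mid \mathrm{OM}_O\langle s,\tilde n,k_i,\tilde t\rangle \mid \bar{s}\langle l,k_e\rangle\,)\,] \\
Z' & \equiv & E[\,(\nu\tilde n)(\,\overline{m_i}k' \mid \mathrm{PP}_O\langle s,\tilde n,\tilde v\rangle \mid (\nu k^*)(\,\mathrm{OM}_O\langle s,\tilde n,k^*,\tilde t\rangle \mid \mathrm{CM}[\,X_l(s)^{k^*}\,]\,)\,)\,]
\end{array}$$

for some key $k'$, some set $\tilde v$ of pre-processed requests, and $X_l(s)$
denoting the respective continuation behavior given in the table of manager
definitions.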

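Note that the $k^*$ in $Z'$ is fresh, so its scope can be extruded over
$\mathrm{PP}_O\langle\ldots\rangle$ and $\overline{m_i}k'$. As special cases,
for $l \in \{\mathrm{png}\_r, \mathrm{sur}\_r\}$, of committed object managers,
we define

$$\begin{array}{rcl}
F[\cdot] & \stackrel{\mathrm{def}}{=} & (\nu\tilde n k^*)(\,\overline{m_i}k \mid \mathrm{PP}_O\langle s,\tilde n,\tilde v\rangle \mid \mathrm{OM}_O\langle s,\tilde n,k^*,\tilde t\rangle \mid [\cdot]\,) \\
\mathrm{pingO}_O\langle s,r,k,\tilde t,\tilde v\rangle & \stackrel{\mathrm{def}}{=} & F[\,\mathrm{CM}[\,[\![s]\!]_r^{k^*}\,]\,] \\
\mathrm{surO}_O\langle s,r,k,\tilde t,\tilde v\rangle & \stackrel{\mathrm{def}}{=} & F[\,\mathrm{CM}[\,[\![s.\mathrm{alias}(s.\mathrm{clone})]\!]_r^{k^*}\,]\,]
\end{array}$$

and discuss their properties in Section 4.2.


4 On the Safety of Surrogation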

In previous work, we motivated the equation a.ping = a.surrogate for
contextual equivalence = based on convergence as a valuable goal for proving
safety of surrogation. Its interpretation is that an object a, if it is
responsive to a ping-request, behaves the same as the object a after
surrogation. One of the main observations there was that the desired equation
cannot hold in full generality for Øjeblik contexts $C[\cdot]$ in which the
operation x.surrogate could occur internally. The reason is that, after
internal surrogation, an object may intentionally misuse the old and new
references to itself. Actually, the advice to avoid internal surrogation is
analogous to the fact that programmers, knowing that x=0, should never use
division by x. In contrast, external surrogation corresponds to the case where
a program receives x from some other module, so it should be guaranteed that x
will never be 0. Analogously, we conjectured that in our semantics external
surrogation is guaranteed to be safe.

In this section, we prove that $C[x.\mathrm{ping}]{\Downarrow}$ iff
$C[x.\mathrm{surrogate}]{\Downarrow}$ for precisely those cases where
$C[\cdot]$ will never lead to self-inflicted occurrences of x.surrogate.
Although this is an undecidable criterion, we may still formalize it in terms
of our π-calculus semantics, as we do in Subsection 4.1, for its use in formal
proofs. In Subsection 4.2 we study the behavior of objects before and after
surrogation within tightly restricted contexts and prove them to be barbed
equivalent. In Subsection 4.3 we then give an outline of the formal proof for
the safety of external surrogations. The full paper also offers a static type
system that guarantees that surrogations will never be internal.

4.1 On the Absence of Internal Surrogation 

Here, we study how to formalize that $C[\cdot]$ will never lead to
self-inflicted occurrences of the term x.surrogate when it is plugged into the
hole.

Recall from the Øjeblik semantics that in a particular state
$[\![a]\!]_p \Rightarrow Z$ of the computation of an arbitrary Øjeblik term
$a$, a particular sur-request is self-inflicted if
$Z \equiv E[\,\bar{s}\langle\mathrm{sur}\_r,k\rangle \mid \mathrm{OM}_O\langle s,\tilde m,k_e,k_i,\tilde t\rangle\,]$
with $k = k_i$, because it is then ready to enter the OM with $k = k_i$.
Since we must ensure that a sur-request never leads to internal surrogation,
we must quantify over all derivatives of $[\![a]\!]_p$ and check for
self-infliction in each of them.

Note that, starting from the term $[\![C[x.\mathrm{surrogate}]]\!]_p$, we
should not be concerned with arbitrary sur-requests that appear at top level
during computation, but only with those that “arise from the request in the
hole”. However, this property is hard to determine for two different reasons:
(1) All of the names mentioned in a sur-request may be changed dynamically by
instantiation: $s$ (due to forwarding), $r$ (due to a call manager protocol),
and $k$ (due to pre-processing). (2) We have to consider arbitrarily many
duplications of the request in the case that the hole appears, at the level of
Øjeblik terms, within a method body, which leads to replication in the
π-calculus semantics. For both reasons, we need a tool to uniquely identify
the various incarnations of the request.

Let operate $\in$ {ping, surrogate}, and let op $\in$ {png, sur} denote the
corresponding π-calculus label. We introduce the additional Øjeblik labels
operate* $\in$ {ping*, surrogate*}, writing $\mathcal{L}^*$ for the resulting
language. The intuition is that tagged labels are semantically treated exactly
like their untagged counterparts, but can be distinguished from them
syntactically. Consequently, we give a tagged semantics, written
$[\![\,\cdot\,]\!]^*$, by adding the respective clauses for tagged labels,
which are just copies of the clauses for the untagged labels; we use the
tagged π-calculus labels op* $\in$ {png*, sur*} at the semantics level. As a
result, both tagged and untagged requests can be sent to object and alias
managers; object managers ignore the tagging information of requests and treat
op*- and op-requests identically, but alias managers preserve the tagging
information since they simply forward requests. We also add a tag to all
parameterized definitions and abbreviations when considering the tagged
semantics.

The semantics is not affected by including tagging information. 

Lemma 4. Let $x$ be an Øjeblik variable and $C[\cdot]$ an untagged Øjeblik
context. Then: $C[x.\mathit{operate}]{\Downarrow}$ iff
$[\![C[x.\mathit{operate}^*]]\!]^*_p \Downarrow_p$.

However, tagging helps us to detect all “requests arising from the hole” . 

Definition 5 (Safe Contexts). Let $x$ be a variable and $C[\cdot]$ an untagged
Øjeblik context. Then, $C[\cdot]$ is called safe for x.surrogate, if
$[\![C[x.\mathrm{surrogate}^*]]\!]^*_p \Rightarrow E[\,\bar{s}\langle\mathrm{sur}^*\_r,k\rangle \mid \mathrm{OM}_O\langle s,\tilde m,k_e,k_i,\tilde t\rangle\,]$
implies that $k \neq k_i$.

We replay the definition using ping instead of surrogate. By definition of the
semantics, an Øjeblik context $C[\cdot]$ is then safe for x.surrogate if and
only if it is safe for x.ping. For convenience, by abuse of terminology, we
simply call $C[\cdot]$ safe for $x$.
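
The role of tagging can be pictured with a small data model. This is a sketch under stated assumptions: the `Request` record and the three helper functions are hypothetical illustrations of the prose above and do not reproduce the actual manager definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    op: str               # "png" or "sur"
    key: str              # key carried by the request
    tagged: bool = False  # True for the starred labels png*, sur*


def alias_forward(req: Request) -> Request:
    """Alias managers simply forward, so the tag survives unchanged."""
    return req


def object_dispatch(req: Request) -> str:
    """Object managers ignore the tag: op* and op are handled identically."""
    return {"png": "ping", "sur": "surrogate"}[req.op]


def is_self_inflicted(req: Request, internal_key: str) -> bool:
    """A tagged sur-request violates safety when it carries the internal key."""
    return req.tagged and req.op == "sur" and req.key == internal_key
```

A context would then count as safe in the sense of Definition 5 if no reachable state pairs a tagged sur-request with an object manager for which `is_self_inflicted` holds.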

4.2 Committed External Surrogation is Transparent 

Our main result focuses on external surrogations. In the following we show
that the two versions of an object manager at $s$ that are committed to an
external png- and sur-request, respectively (cf. Section 3.3), are related by
typed barbed equivalence.

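Theorem 1. Let $\Gamma \vdash \mathrm{surO}_O\langle s,r,k,\tilde t,\tilde v\rangle$
and $\Gamma \vdash \mathrm{pingO}_O\langle s,r,k,\tilde t,\tilde v\rangle$. Then:
$\mathrm{surO}_O\langle s,r,k,\tilde t,\tilde v\rangle \cong_{\Gamma;s} \mathrm{pingO}_O\langle s,r,k,\tilde t,\tilde v\rangle$.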

The proof of Theorem 1 requires several strong lemmas. Lemma 5 proves that
surrogation results in an alias pointing to a clone of the old object. Its
proof heavily relies on the nonces used in the implementation of object and
alias managers, which control the interference with the environment. Lemma 6
proves that the aliased object manager appearing in Lemma 5 behaves as a
forwarder. Lemma 7 builds on an earlier lemma to prove correctness of
inlining. Lemma 8 proves that pre-processing external requests does not
preclude other requests. Lemma 9 involves two confluent reductions from right
to left along $r$ and $m_i$, respectively. Finally, Theorem 1 can be
established by applying the previous lemmas.

Lemma 5. If $\Gamma$ is a suitable type environment for the processes below,
then:
$\mathrm{surO}_O\langle s,r,k,\tilde t,\tilde v\rangle \cong_{\Gamma;s} (\nu s^*)(\,(\nu k_i)\,\mathrm{freeA}_O\langle s,k_i,s^*,\tilde v\rangle \mid \mathrm{newO}_O\langle s^*,\tilde t\rangle \mid \bar{r}\langle s^*,k\rangle\,)$.

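Lemma 6. Let $\tilde v := v_1 \ldots v_n$, and $v_j := \langle l_j,k_j\rangle$
for $1 \le j \le n$. If $\Gamma$ is a suitable type environment for the
processes below, then:

$$(\nu k_i)\,\mathrm{freeA}_O\langle s,k_i,s^*,\tilde v\rangle \;\cong_{\Gamma;s}\; s \triangleright s^* \mid \prod_{1 \le j \le n} \bar{s}\,v_j$$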



Lemma 7. Let P be a process and s a channel such that s ^ fc(L’). If F is a 
suitable type environment for the processes below, then: 

{us*){s>s* \P) 

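Lemma 8. Let $\tilde v := v_1 \ldots v_n$ with $v_j := \langle l_j,k_j\rangle$
and $k_j \neq k_i$ for $1 \le j \le n$. If $\Gamma$ is a suitable type
environment for the processes below, then:

$$\prod_{1 \le j \le n} \bar{s}\,v_j \mid \mathrm{newO}_O\langle s,\tilde t\rangle \;\cong_{\Gamma;s}\; (\nu k_i)\,\mathrm{freeO}_O\langle s,k_i,\tilde t,\tilde v\rangle$$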



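Lemma 9. Let $\tilde v := v_1 \ldots v_n$ with $v_j := \langle l_j,k_j\rangle$
and $k_j \neq k_i$ for $1 \le j \le n$. If $\Gamma$ is a suitable type
environment for the processes below, then:

$$\bar{r}\langle s,k\rangle \mid (\nu k_i)\,\mathrm{freeO}_O\langle s,k_i,\tilde t,\tilde v\rangle \;\cong_{\Gamma;s}\; \mathrm{pingO}_O\langle s,r,k,\tilde t,\tilde v\rangle$$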



4.3 External Surrogation is Safe 

We prove our main theorem that x.ping = x.surrogate for safe contexts
$C[\cdot]$.

Theorem 2 (Safety). Let $x$ be an object variable and $C[\cdot]$ an untagged
context in Øjeblik. If $C[\cdot]$ is safe for $x$, then
$C[x.\mathrm{ping}]{\Downarrow}$ iff $C[x.\mathrm{surrogate}]{\Downarrow}$.

Proof. (Sketch) By Lemma 4, our proof obligation is equivalent to:

$$[\![C[x.\mathrm{ping}^*]]\!]^*_p \Downarrow_p \quad\text{iff}\quad [\![C[x.\mathrm{surrogate}^*]]\!]^*_p \Downarrow_p$$

This allows us to make use of the assumption on the safety of context
$C[\cdot]$.

Since the semantics $[\![\,\cdot\,]\!]^*$ is compositional, there is a
π-calculus context $D[\cdot]$ and names $y, j, q$ such that
$[\![C[x.\mathit{operate}^*]]\!]^*_p = D[\bar{y}\langle\mathit{op}^*\_q, j\rangle]$,
where $D[\cdot]$ itself does not contain any message carrying a tagged request.
We prove that

$$D[\bar{y}\langle\mathrm{png}^*\_q, j\rangle] \Downarrow_p \quad\text{iff}\quad D[\bar{y}\langle\mathrm{sur}^*\_q, j\rangle] \Downarrow_p$$

and concentrate on the implication from right to left. The converse is
analogous.

Assume that $D[\bar{y}\langle\mathrm{sur}^*\_q, j\rangle] \Downarrow_p$. If
$D[N] \Downarrow_p$ for every process $N$, then this is also the case for
$N = \bar{y}\langle\mathrm{png}^*\_q, j\rangle$; otherwise, the sur*-request
must contribute to the barb. Therefore, we assume
$D[\bar{y}\langle\mathrm{sur}^*\_q, j\rangle] \Rightarrow P \downarrow_p$ and
show that there is a corresponding
$D[\bar{y}\langle\mathrm{png}^*\_q, j\rangle] \Rightarrow Q \downarrow_p$ where
$Q \cong_\Gamma P[\mathrm{png}^*/\mathrm{sur}^*]$. Since typed barbed
equivalence $\cong_\Gamma$ and relabeling preserve convergence, this suffices.

We distinguish between insignificant and significant transitions, where the
former can be mimicked easily, while the latter require some work. Recall that
a reduction step due to an external request is committing (cf. Section 3.3) if
it represents the consumption of a pre-processed request by an object manager.
Now, we combine this characterization with the fact that we have to
concentrate on surrogation requests arising from the hole within the reduction
sequence $D[\bar{y}\langle\mathit{op}^*\_q, j\rangle] \Rightarrow P \downarrow_p$
and call significant ($\to_s$) precisely those steps that exhibit the
commitment to an external op*-request. The other steps, except for the cases
of internal surrogation, which are precisely excluded by assumption, are
insignificant: they can even be mimicked up to structural equivalence.

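In order to make the proof work, we iterate the simulation steps along the
given sequence
$D[\bar{y}\langle\mathrm{sur}^*\_q, j\rangle] \Rightarrow P \downarrow_p$. Let
us assume that this sequence has $d$ significant steps and that we have
already iterated $h-1$ of them, for $0 < h \le d$:

$$D[\bar{y}\langle\mathrm{sur}^*\_q, j\rangle] \;(\Rightarrow\to_s)^{h-1}\; P_{h-1} \;\Rightarrow\to_s\; P_h \equiv E[\,\mathrm{surO}_O\langle s_h,\ldots\rangle\,]$$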



By (the tagged counterparts of) Lemma 3 we can precisely localize the state of
the committed object manager inside $P_h$ after the significant step
(cf. Section 3.3). With respect to the relabeling
$\rho := [\mathrm{png}^*/\mathrm{sur}^*]$, by assumption and iteration, we also
have the sequence:

$$D[\bar{y}\langle\mathrm{png}^*\_q, j\rangle] \;(\Rightarrow\to_s)^{h-1}\; \cong_\Gamma\; P_{h-1}\rho \;\Rightarrow\to_s\; Q_h \equiv E[\,\mathrm{pingO}_O\langle s_h,\ldots\rangle\,]\rho$$



by consuming a png*-request. Now, we apply (the tagged counterparts of)
Theorem 1 together with the lemmas above, and the fact that barbed equivalence
contains structural equivalence $\equiv$ and is preserved by relabeling, and
get $Q_h \cong_\Gamma P_h\rho$. This means that we can mimic the significant
steps, thus the whole sequence, up to



Abstract. We propose an interpretation of a typed concurrent calculus
of objects based on the imperative object calculus of Abadi and Cardelli.
The target of our interpretation is a version of the blue calculus, a variant
of the π-calculus that directly contains functions, with record and
first-order types. We show that reductions and type judgments are derivable in
a rather simple and natural way, and that our encoding can be extended
to recursive and self-types, as well as to synchronization primitives. We
also use our encoding to prove some equational laws on objects.



1 Introduction 

In the recent past, there has been a growing interest in the theoretical
foundations of object-oriented and concurrent programming languages. One of
the means used to explain object-oriented concepts, such as object types or
self-referencing, has been to look for an interpretation of these concepts in
simpler formalisms, such as typed $\lambda$-calculi. However, these
interpretations are difficult and very technical, due to the difficulties
raised by the typing, and subtyping, of objects. To circumvent these problems,
Abadi and Cardelli have defined a canonical object-oriented calculus, the
$\varsigma$-calculus [1], in which the notion of object is primitive, and they
have developed and studied type systems for this calculus.

In this paper, we give a model of concurrent object computation based on a
modeling of objects as processes. We introduce some derived notations for
objects and we give their translation in a version of the blue calculus,
$\pi^*$ [5], extended with records. We type Blue calculus processes using an
implicit, first-order type system based on the simply typed $\lambda$-calculus.

Using these derived constructs, we give an interpretation of a concurrent
and imperative version of $\varsigma$ defined by Gordon and Hankin,
conc$\varsigma$ [13]. We prove that this interpretation preserves reduction,
typing and subtyping judgments. Therefore, our encoding gives an
interpretation of complex notions, such as method update or object types, in
terms of more basic notions such as records, field selection and functional
types. Consequently, we obtain a type-safe way to implement higher-order
concurrent objects in the Blue calculus, and therefore in the π-calculus
($\pi$). Moreover, we can validate possible extensions of conc$\varsigma$ and,
what is more original, we can use the embedding of conc$\varsigma$ in the Blue
calculus to do equational reasoning on the source calculus. As an example, we
sketch the proof of an equational law between objects at the end of this
paper.

We organize the rest of the paper as follows. The next section introduces the
Blue calculus using very simple intuitions taken from the $\lambda$-calculus
execution model. This is an occasion to give an informal and intuitive
presentation of the Blue calculus to the reader. Section 3 briefly introduces
Gordon and Hankin’s calculus of objects and gives its interpretation in
$\pi^*$. We prove that conc$\varsigma$ is embedded in $\pi^*$ and that objects
can be viewed as a particular kind of (linearly managed) resource. Section 4
is dedicated to the typing of processes and objects. It introduces a new type
operator that is very well suited for typed continuation passing style
transformations. Before concluding, we look at possible applications of our
interpretation. Complete definitions of the calculi and omitted proofs may be
found in a long version of the paper [8].

2 The Blue Calculus 

In the functional programming world, a program is ideally represented by a
$\lambda$-calculus term, that is, a term generated by the following grammar:

$$M, N ::= x \mid \lambda x.M \mid (M\,N)$$

We enrich this calculus with a set of constants, $a_1, a_2, \ldots$, called
names, that can be interpreted as resource locations. We describe a very
simple execution model for programs written in this syntax based on the notion
of abstract machine (AM), and we enrich it until we obtain a model that
exhibits concurrent behaviors similar to those expressible in the π-calculus.
This abstract machine sets up the foundation of the Blue calculus, which can
therefore be viewed, at the same time, as a concurrent $\lambda$-calculus and
as an applicative π-calculus.

An abstract machine is defined by a set of configurations, denoted
$\mathcal{K}$, and a set of transition rules, $\mathcal{K} \to \mathcal{K}'$,
which define elementary computing steps. In our setting, a machine
configuration is a triple $\{\mathcal{E}; M; S\}$ where $\mathcal{E}$ is a
memory, or store, that is, an association between names and programs; $M$ is a
program, that is, a $\lambda$-term; and $S$ is a stack containing the
arguments of functional calls. Initially an AM has an empty memory, denoted by
the symbol $\epsilon$, which can be extended with new declarations as in
$(\mathcal{E} \mid \langle a_n = N\rangle)$. The stack has a similar structure,
and we use $(a_j, S)$ to denote the operation of adding the name $a_j$ to the
stack.

An execution of the functional AM starts in the initial configuration
$\mathcal{K}_0$, with an empty stack and memory
($\mathcal{K}_0 = \{\epsilon; M; \epsilon\}$). The transition rules are defined
as follows, where $M\{x \leftarrow a_j\}$ denotes the outcome of renaming all
free occurrences of $x$ in $M$ to the name $a_j$.

$$\begin{array}{ll}
\{\mathcal{E}; a_i; S\} \to \{\mathcal{E}; N_i; S\} & (\mathcal{E} = \cdots \mid \langle a_i = N_i\rangle \mid \cdots) \\
\{\mathcal{E}; \lambda x.M; (a_j, S)\} \to \{\mathcal{E}; M\{x \leftarrow a_j\}; S\} & \\
\{\mathcal{E}; (M\,N); S\} \to \{(\mathcal{E} \mid \langle a_n = N\rangle); M; (a_n, S)\} & (a_n \text{ fresh name})
\end{array}$$



For example, to evaluate a function application we memorize the argument
in a fresh memory location, and we add the name of this location to the stack.
At each computation step, the machine is in a configuration of the kind:

$$\mathcal{K}_n = \{(\langle a_1 = N_1\rangle \mid \cdots \mid \langle a_n = N_n\rangle); M; (a_{i_1}, \ldots, a_{i_k})\}$$

where the indices $i_j \in 1..n$ for each $j \in 1..k$. Each configuration
corresponds to a $\lambda$-term and, for example, $\mathcal{K}_n$ corresponds
to $(M a_{i_1} \ldots a_{i_k})\{a_1 \leftarrow N_1\} \ldots \{a_n \leftarrow N_n\}$.
Therefore, to each extension of the functional AM corresponds a generalization
of the $\lambda$-calculus. In this section, we improve the functional AM until
we obtain an execution model that compares to that of $\pi$. The calculus
defined by the extended AM is the Blue calculus.
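
The three transition rules of the functional AM can be sketched directly as a small interpreter. This is a minimal illustration, not the paper's definition: the tuple encoding of terms, the generated names `a0`, `a1`, ..., and the step limit are assumptions chosen for the example.

```python
# Terms: ("var", x) | ("name", a) | ("lam", x, body) | ("app", fun, arg)

def rename(term, x, a):
    """M{x <- a}: replace the free occurrences of variable x by the name a."""
    kind = term[0]
    if kind == "var":
        return ("name", a) if term[1] == x else term
    if kind == "name":
        return term
    if kind == "lam":
        # An inner binder for the same variable shadows x.
        return term if term[1] == x else ("lam", term[1], rename(term[2], x, a))
    return ("app", rename(term[1], x, a), rename(term[2], x, a))


def step(store, term, stack):
    """One transition of the machine, or None if no rule applies."""
    kind = term[0]
    if kind == "name" and term[1] in store:
        return store, store[term[1]], stack                       # fetch a resource
    if kind == "lam" and stack:
        return store, rename(term[2], term[1], stack[0]), stack[1:]  # pop an argument
    if kind == "app":
        fresh = f"a{len(store)}"                                  # assumed fresh name
        return {**store, fresh: term[2]}, term[1], (fresh,) + stack
    return None


def run(term, limit=1000):
    """Iterate `step` from the initial configuration {empty; term; empty}."""
    config = ({}, term, ())
    for _ in range(limit):
        nxt = step(*config)
        if nxt is None:
            return config
        config = nxt
    raise RuntimeError("step limit reached")
```

Running the machine on the identity applied to another identity stores the argument under a fresh name, pushes that name, pops it again, and finally fetches the stored term.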

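We start with simple syntactical modifications. We modify our notations to
use a sequence of applications instead of a stack, recasting the standard
configuration $\mathcal{K}_n$ into:
$\langle a_1 = N_1\rangle \mid \cdots \mid \langle a_n = N_n\rangle \mid (M a_{i_1} \ldots a_{i_k})$.
With these modifications, we can reformulate the transition rules in the
following way, with the side condition that the name $a$ is fresh in rule
$(\chi)$:

$$\begin{array}{rcll}
\langle a = N\rangle \mid \cdots \mid (a\,a_1 \ldots a_n) & \to & \langle a = N\rangle \mid \cdots \mid (N a_1 \ldots a_n) & (\rho) \\
(\lambda x.M) a_1 \ldots a_n & \to & (M\{x \leftarrow a_1\}) a_2 \ldots a_n & (\beta) \\
(M\,N) a_1 \ldots a_n & \to & \langle a = N\rangle \mid M a\,a_1 \ldots a_n & (\chi) \\
\mathcal{K} \to \mathcal{K}' & \Rightarrow & (\langle a = N\rangle \mid \mathcal{K}) \to (\langle a = N\rangle \mid \mathcal{K}') & (w)
\end{array}$$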

In this new presentation, rule $(\beta)$ corresponds to a simplified form of
beta-reduction, where we substitute a name, and not a term, for a variable,
whereas rule $(\rho)$ can be interpreted as a form of communication.
Nonetheless, whereas the classical π-calculus communication model is based on
message synchronization, we use instead a particular kind of resource
fetching.

A first improvement to the AM is to consider $\mid$ as an associative
composition operator, and to allow multiple configurations in parallel. We do
not choose a commutative operator. The idea is to separate, in each
configuration, the store from the active part, that is, to separate the memory
from the evaluated term. We allow some commutations though, with the
restriction that the evaluated term is always at the right of the topmost
parallel composition. More formally, we consider the following structural
rules for parallel composition, where $P \rightleftharpoons Q$ means that both
$P \to Q$ and $Q \to P$ hold.

$$(M_1 \mid M_2) \mid M_3 \rightleftharpoons M_1 \mid (M_2 \mid M_3) \qquad (M_1 \mid M_2) \mid M_3 \rightleftharpoons (M_2 \mid M_1) \mid M_3$$



As a result, we obtain an asymmetric parallel composition operator, like the
one defined in conc$\varsigma$, or the formal description of CML [10]. Another
consequence is that we can replace rule $(\rho)$ by the simpler rule:

$$\langle a = N\rangle \mid (a\,a_1 \ldots a_n) \to \langle a = N\rangle \mid (N a_1 \ldots a_n) \qquad (2.1)$$

Roughly speaking, we have transformed our functional AM to a Chemical AM
(CHAM) in the style of [4]. The most notable improvement is the possibility to
compose multiple configurations and, for example, to define configurations
with multiple declarations for the same name. Indeed this introduces the
possibility of non-deterministic transitions, such as:

$$\begin{array}{l}
(\langle a = N_1\rangle \mid \langle a = N_2\rangle \mid a) \to (\langle a = N_1\rangle \mid \langle a = N_2\rangle \mid N_1) \\
(\langle a = N_1\rangle \mid \langle a = N_2\rangle \mid a) \to (\langle a = N_1\rangle \mid \langle a = N_2\rangle \mid N_2)
\end{array}$$

Another improvement to our concurrent AM is the addition of a new kind of
declaration that is discarded after a communication. We denote this
declaration $\langle a \Leftarrow P\rangle$, and we add the following
communication rule.

$$\langle a \Leftarrow N\rangle \mid (a\,a_1 \ldots a_n) \to (N a_1 \ldots a_n) \qquad (2.2)$$

Intuitively, the declaration $\langle a \Leftarrow N\rangle$ allows us to
control explicitly the number of accesses to the resource named $a$, like the
input operator $a(x).P$ in $\pi$, and thus it allows us to capture the
evaluation blocking phenomena that are peculiar to concurrent executions. In
the encoding of concurrent objects in $\pi^*$, we will see that objects also
appear as a particular kind of declaration.

Finally, we add the possibility to dynamically create fresh names. This
mechanism is a distinctive feature of the π-calculus, and it is very easily
implemented in our CHAM by adding the restriction operator,
$(\nu a)\mathcal{K}$, together with new reduction rules. Using the restriction
operator we can, for example, define the internal choice operator
$(M \oplus N)$ to be the term
$(\nu a)(\langle a \Leftarrow M\rangle \mid \langle a \Leftarrow N\rangle \mid a)$.
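
The contrast between replicated declarations (rule 2.1), linear declarations (rule 2.2), and the derived internal choice can be sketched by enumerating the possible fetches. The triple encoding of declarations and the function names are assumptions made for this illustration; restriction is only mimicked by choosing a name that nothing else uses.

```python
def fetch(decls, name):
    """Return every (remaining declarations, fetched body) pair for `name`.

    `decls` is a tuple of (kind, name, body) triples, where kind is "rep"
    for a replicated declaration and "lin" for a linear one.
    """
    outcomes = []
    for i, (kind, declared, body) in enumerate(decls):
        if declared != name:
            continue
        if kind == "rep":
            outcomes.append((decls, body))                      # rule (2.1): kept
        else:
            outcomes.append((decls[:i] + decls[i + 1:], body))  # rule (2.2): consumed
    return outcomes


def internal_choice(m, n, fresh="a"):
    """Both results of the choice between m and n, built from linear declarations."""
    soup = (("lin", fresh, m), ("lin", fresh, n))
    return [body for _, body in fetch(soup, fresh)]
```

Two declarations of the same name yield two possible fetches, and a linear declaration disappears once fetched, so a second access to it blocks.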



2.1 The Calculus 

The Blue calculus can be viewed as the calculus obtained from the concurrent
AM, in the same way that the join-calculus is derived from the reflexive CHAM
defined in [12]. The following table gives the syntax of processes, $P$. The
syntax depends on a set of atomic names, $\mathcal{N}$, ranged over by
$a, b, \ldots$ and partitioned in three kinds: variables $x, y, \ldots$, bound
by abstractions, $(\lambda x)P$; references $u, v, \ldots$, bound by
restrictions, $(\nu u)P$; and labels $k, l, \ldots$, used to name record
fields.

Processes

    $P, Q ::=$                           process
        $a$                              name
        $(\lambda x)P$                   small $\lambda$-abstraction
        $(P\,a)$                         application
        $(P \mid Q)$                     parallel composition
        $(\nu u)P$                       name restriction
        $\langle u \Leftarrow P\rangle$  linear declaration
        $\langle u = P\rangle$           replicated declaration
        $[\,]$                           empty record
        $[P, l = Q]$                     record extension
        $(P \cdot l)$                    selection

Our syntax enforces a restricted usage of names with respect to their kinds:
we only allow declaration on references and abstraction on variables. This
rules out terms such as $(\lambda x)\langle x \Leftarrow P\rangle$, for
example.

The formal operational semantics of $\pi^*$ is given in a chemical style and
defined using two relations. First, the structural congruence relation
$\equiv$, which is equivalent to the relation $\rightleftharpoons$ used
previously, and that is used to rearrange terms. Second, the reduction
relation, which represents real computation steps, and that corresponds to
(2.1), (2.2) and $(\beta)$.

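Structural congruence

$$\begin{array}{ll}
P \equiv P & \text{(Struct Refl)} \\
Q \equiv P \Rightarrow P \equiv Q & \text{(Struct Symm)} \\
P \equiv Q,\ Q \equiv R \Rightarrow P \equiv R & \text{(Struct Trans)} \\
P \equiv Q \Rightarrow (\lambda x)P \equiv (\lambda x)Q & \text{(Struct Lam)} \\
P \equiv Q \Rightarrow (P\,a) \equiv (Q\,a) & \text{(Struct Appl)} \\
P \equiv Q \Rightarrow (P \cdot l) \equiv (Q \cdot l) & \text{(Struct Sel)} \\
P \equiv Q \Rightarrow (P \mid R) \equiv (Q \mid R) & \text{(Struct Par)} \\
P \equiv Q \Rightarrow (\nu u)P \equiv (\nu u)Q & \text{(Struct Res)} \\
P \equiv Q \Rightarrow \langle u \Leftarrow P\rangle \equiv \langle u \Leftarrow Q\rangle & \text{(Struct Decl)} \\
P \equiv Q \Rightarrow \langle u = P\rangle \equiv \langle u = Q\rangle & \text{(Struct Mdecl)} \\
P \equiv Q,\ R \equiv S \Rightarrow [P, l = R] \equiv [Q, l = S] & \text{(Struct Over)} \\
((P \mid Q) \mid R) \equiv (P \mid (Q \mid R)) & \text{(Struct Par Assoc)} \\
((P \mid Q) \mid R) \equiv ((Q \mid P) \mid R) & \text{(Struct Par Comm)} \\
u \notin \mathrm{fn}(Q) \Rightarrow (\nu u)P \mid Q \equiv (\nu u)(P \mid Q) & \text{(Struct Res Par)} \\
u \notin \mathrm{fn}(Q) \Rightarrow Q \mid (\nu u)P \equiv (\nu u)(Q \mid P) & \text{(Struct Par Res)} \\
(\nu u)(\nu v)P \equiv (\nu v)(\nu u)P & \text{(Struct Res Res)} \\
(P \mid Q)a \equiv P \mid (Q\,a) & \text{(Struct Par Appl)} \\
(P \mid Q) \cdot l \equiv P \mid (Q \cdot l) & \text{(Struct Par Sel)} \\
a \neq u \Rightarrow ((\nu u)P)a \equiv (\nu u)(P\,a) & \text{(Struct Res Appl)} \\
((\nu u)P) \cdot l \equiv (\nu u)(P \cdot l) & \text{(Struct Res Sel)}
\end{array}$$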

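Reduction

$$\begin{array}{ll}
P \to P' \Rightarrow (P\,a) \to (P'\,a) & \text{(Red Appl)} \\
P \to P' \Rightarrow (P \cdot l) \to (P' \cdot l) & \text{(Red Sel)} \\
P \to P' \Rightarrow (P \mid Q) \to (P' \mid Q) & \text{(Red Par 1)} \\
Q \to Q' \Rightarrow (P \mid Q) \to (P \mid Q') & \text{(Red Par 2)} \\
P \to P' \Rightarrow (\nu u)P \to (\nu u)P' & \text{(Red Res)} \\
P \to P',\ P \equiv Q \Rightarrow Q \to P' & \text{(Red $\equiv$)} \\
((\lambda u)P)v \to P\{u \leftarrow v\} & \text{(Red Beta)} \\
\langle u \Leftarrow P\rangle \mid (u\,a_1 \ldots a_n) \to (P a_1 \ldots a_n) & \text{(Red Decl)} \\
\langle u = P\rangle \mid (u\,a_1 \ldots a_n) \to \langle u = P\rangle \mid (P a_1 \ldots a_n) & \text{(Red Mdecl)} \\
[P, l = Q] \cdot l \to Q & \text{(Red Sel)} \\
k \neq l \Rightarrow [P, l = Q] \cdot k \to P \cdot k & \text{(Red Over)}
\end{array}$$

Notations. We use $\tilde a$ to denote the sequence $a_1, \ldots, a_n$ and
$\mathrm{fn}(P)$ to denote the set of free names in $P$. We abbreviate a
sequence of abstractions, $(\lambda x_1) \ldots (\lambda x_n)P$, into
$(\lambda\tilde x)P$. The same convention applies for $(\nu\tilde u)P$. We also
abbreviate a sequence of extensions,
$[[[\,], l_1 = P_1], \ldots, l_n = P_n]$, where $l_1, \ldots, l_n$ are pairwise
distinct labels, into $[\,l_i = P_i^{\,i \in 1..n}\,]$, and a sequence of
applications, $P a_1 \ldots a_n$, into $P\tilde a$.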
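
The two record rules, (Red Sel) for a matching label and (Red Over) for a non-matching one, amount to a lookup through nested extensions. The following sketch assumes a tuple encoding of records and raises `KeyError` where the calculus would simply have no reduction; both choices are illustrative only.

```python
EMPTY = ("empty",)


def extend(record, label, value):
    """Build the record extension [record, label = value]."""
    return ("ext", record, label, value)


def select(record, label):
    """Selection record·label, following the two record reduction rules."""
    while record[0] == "ext":
        _, inner, l, value = record
        if l == label:
            return value     # [P, l = Q]·l -> Q
        record = inner       # k != l: [P, l = Q]·k -> P·k
    raise KeyError(label)    # selection on the empty record is stuck
```

Note that extending a record twice with the same label overrides the older field, since the outermost extension is inspected first.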



2.2 Derived Operators 

To simplify the presentation of our encoding and of the type system, we
introduce three derived operators.

$$\begin{array}{rcll}
\mathrm{def}\ u = P\ \mathrm{in}\ Q & = & (\nu u)(\langle u = P\rangle \mid Q) & \text{definition} \\
\mathrm{set}\ x = P\ \mathrm{in}\ Q & = & (\nu u)(\langle u \Leftarrow (\lambda x)Q\rangle \mid (P\,u)) & \text{linear application} \\
\mathrm{reply}(a) & = & (\lambda r)(r\,a) & \text{synchronous message}
\end{array}$$

We may interpret the first operator, subsequently called a definition, as an
explicit substitution of $P$ for the name $u$ in $Q$, and we can define
higher-order application, $(P\,Q)$, as a shorthand for
$\mathrm{def}\ u = Q\ \mathrm{in}\ (P\,u)$, where
$u \notin \mathrm{fn}(P) \cup \mathrm{fn}(Q)$.

Note that the name $u$ is recursively bound in
$\mathrm{def}\ u = P\ \mathrm{in}\ Q$, and that it is possible to define
recursion, $\mathrm{rec}\ u.P$, by $(\mathrm{def}\ u = P\ \mathrm{in}\ u)$.
Using linear application, it is possible to define sequential composition,
$P ; Q$, as $(\mathrm{set}\ u = P\ \mathrm{in}\ Q)$, for some $u$ not free in
$Q$. The reply operator is used in continuation passing style encoding and,
for example, to return a value in a linear application.

$$\mathrm{set}\ x = \mathrm{reply}(a)\ \mathrm{in}\ Q \;\to\; (\nu u)(\langle u \Leftarrow (\lambda x)Q\rangle \mid (u\,a)) \;\to^*\; Q\{x \leftarrow a\}$$

We can compare $\mathrm{reply}(a)$ with the synchronous names of the
join-calculus, with the difference that, using our notation for higher-order
application, it is possible to “reply” a general term, with
$\mathrm{reply}(P)$ standing for the term
$\mathrm{def}\ u = P\ \mathrm{in}\ (\lambda r)(r\,u)$.
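
The continuation passing reading of `reply` and of linear application can be mimicked with ordinary functions. This is only a sequential sketch under loud assumptions: it ignores concurrency, restriction, and the linearity of the declaration, and the names `set_in` and `seq` are introduced here for illustration.

```python
def reply(a):
    """reply(a) = (λr)(r a): hand the value a to a continuation r."""
    return lambda r: r(a)


def set_in(p, body):
    """set x = P in Q, with Q supplied as the Python function x -> Q."""
    return p(body)


def seq(p, q):
    """P ; Q as set u = P in Q, where the bound u is not used by Q."""
    return set_in(p, lambda _u: q)
```

For instance, `set_in(reply(3), lambda x: x + 1)` passes 3 to the body, mirroring the reduction of set x = reply(a) in Q to Q with a substituted for x.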



3 Interpretation of the Concurrent Object Calculus 



The calculus conc$\varsigma$ is a calculus based on the notion of naming,
obtained by extending the imperative $\varsigma$-calculus with π-calculus
primitives, such as parallel composition and restriction. Like the imperative
$\varsigma$-calculus, it also provides an operator to clone an object, and a
call-by-value definition operator: let x = a in b.



Expressions and results

    $u, v ::=$                                          result
        $x$                                             variable
        $p$                                             name
    $d ::=$                                             denotation
        $\{l_i = \varsigma(x_i)b_i^{\,i \in 1..n}\}$
    $a, b ::=$                                          term
        $u$                                             result
        $(p \mapsto d)$                                 denomination
        $u.l$                                           method invocation
        $u.l \Leftarrow \varsigma(x)b$                  method update
        $\mathrm{clone}(u)$                             cloning
        $\mathrm{let}\ x = a\ \mathrm{in}\ b$           sequencing
        $(a \Rsh b)$                                    parallel composition
        $(\nu p)a$                                      name restriction


The basic constructor of conc$\varsigma$ is the denomination, $(p \mapsto d)$,
that, informally, adds a name to an object of $\varsigma$ and acts like a
special kind of declaration. It represents the store of an object-oriented
program, like the declaration $\langle a = M\rangle$ represents the store of a
configuration in the functional AM. We omit the formal definition of
conc$\varsigma$ in this paper, but we hope that the reader unfamiliar with
this calculus can grasp some idea of it from our interpretation. Its
operational semantics is defined in a chemical style, with a structural
equivalence, $\equiv$, analogous to the homonymous relation in $\pi^*$, and a
reduction relation, $\to$. Some reductions of conc$\varsigma$ come from the
let constructor, where values are names. For example,
$(\mathrm{let}\ x = p\ \mathrm{in}\ b) \to b\{x \leftarrow p\}$. However, the
basic interactions are between a denomination and a method invocation, a
method update, or a cloning on its name. Assume $d$ is the denotation
$\{l_i = \varsigma(x_i)b_i^{\,i \in 1..n}\}$; we get that:

$$\begin{array}{ll}
(p \mapsto d) \Rsh p.l_j \;\to\; (p \mapsto d) \Rsh b_j\{x_j \leftarrow p\} & j \in 1..n \\
(p \mapsto d) \Rsh (p.l_j \Leftarrow \varsigma(x)b) \;\to\; (p \mapsto d') \Rsh p & d' = \{l_i = \varsigma(x_i)b_i^{\,i \in 1..n,\ i \neq j},\ l_j = \varsigma(x)b\},\ j \in 1..n \\
(p \mapsto d) \Rsh \mathrm{clone}(p) \;\to\; (p \mapsto d) \Rsh (\nu q)((q \mapsto d) \Rsh q) & q \notin \mathrm{fn}(d)
\end{array}$$

We interpret conc$\varsigma$ in the Blue calculus and we prove an operational
correspondence result. In the process of defining the encoding of
conc$\varsigma$, denoted $[\![\,\cdot\,]\!]$ hereafter, we will naturally
introduce some derived operators for the object notations. We will see that
this allows us to regard conc$\varsigma$ as embedded in the Blue calculus, and
therefore $\pi^*$ as an object calculus.

We suppose that the conc$\varsigma$ names are included in $\pi^*$. Informally,
the interpretation of a denomination $(p \mapsto d)$, where
$d = \{l_i = \varsigma(x_i)b_i^{\,i \in 1..n}\}$, is a process modeling a
“reference cell” that memorizes $n$ values,
$(\lambda x_1)[\![b_1]\!], \ldots, (\lambda x_n)[\![b_n]\!]$. That is, a
recursively defined declaration of the name $p$, which encapsulates a record
with $2n$ fields: the access field $\mathit{get}_{l_i}$, used to invoke method
$l_i$; and the field $\mathit{put}_{l_i}$, used to modify this method. We also
add a field named clone that, when selected, creates a fresh cell with a copy
of the current state. Schematically, we use the split-method technique of [2].

Let $R(p, s, \tilde x, c)$ be the following record (of $\pi^*$), with one
$\mathit{get}$ and one $\mathit{put}$ field for each $i \in 1..n$:

$$\begin{array}{rcl}
\mathit{get}_{l_i} & = & (s\,\tilde x \mid x_i\,p), \\
\mathit{put}_{l_i} & = & (\lambda y)(s\,x_1 \ldots x_{i-1}\,y\,x_{i+1} \ldots x_n \mid \mathrm{reply}(p)), \\
\mathit{clone} & = & (s\,\tilde x \mid c\,\tilde x)
\end{array}$$

In our intuition, the identity of an object is a reference at which the object
state can be fetched (the name $p$), its state is a record of methods as in
the classical recursive records semantics [6], and encapsulation is naturally
implemented using the def operator. In particular, the variable $x_i$ is used
to record the value of the method $l_i$, the name $s$ is a pointer to a
function that (given the $x_i$’s) creates the object each time it is accessed,
and the name $c$ is a pointer to a function that creates a fresh copy of the
object when it is cloned.
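
The behavior that this reference cell is meant to provide, namely invocation with self-binding, update that returns the same object, and cloning into a fresh object, can be summarized by a small class. This is a sequential sketch, assuming Python callables in place of method bodies; it illustrates the intended object behavior and is not the process encoding itself.

```python
class Cell:
    """A toy object: a table of methods, each a function of the object itself."""

    def __init__(self, methods):
        self._methods = dict(methods)   # private copy of the method table

    def get(self, label):
        """Method invocation: run the body with self bound to this object."""
        return self._methods[label](self)

    def put(self, label, body):
        """Method update: replace one body and return the same object."""
        self._methods[label] = body
        return self

    def clone(self):
        """Cloning: a fresh object holding a copy of the current methods."""
        return Cell(self._methods)
```

Updating a method of the original after cloning leaves the clone unchanged, which matches the intuition that the clone is a fresh cell holding a copy of the state at cloning time.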

To encode a denomination, we encapsulate the record $R(p, s, \tilde{x}, c)$ in a recursive
definition that linearly manages a declaration of the name $p$. We define a
notation for this definition.

$$ \mathrm{Fobj}(p,\tilde{x},c) = \mathbf{def}\ s = (\lambda\tilde{y})\langle p \Leftarrow R(p,s,\tilde{y},c)\rangle\ \mathbf{in}\ \langle p \Leftarrow R(p,s,\tilde{x},c)\rangle $$

We denote by $\langle p \mapsfrom [l_i = (\lambda x_i)P_i^{\,i\in 1..n}]\rangle$ the process that we obtain by binding
the name $c$ to the function that clones the object, and the names in $\tilde{x}$ to the
functions $(\lambda x_1)P_1, \ldots, (\lambda x_n)P_n$.



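$$ \langle p \mapsfrom [l_i = (\lambda x_i)P_i^{\,i\in 1..n}]\rangle = \left(\begin{array}{l} \mathbf{def}\ c = (\lambda\tilde{x})(\nu q)(\mathrm{Fobj}(q,\tilde{x},c) \mid \mathit{reply}(q)) \\ \mathbf{in\ def}\ u_1 = (\lambda x_1)P_1, \ldots, u_n = (\lambda x_n)P_n \\ \mathbf{in}\ \mathrm{Fobj}(p,\tilde{u},c) \end{array}\right) $$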



Intuitively, the process $\langle p \mapsfrom D\rangle$, where $D = [l_i = (\lambda x_i)P_i^{\,i\in 1..n}]$, can be
divided into two components: an active part, the declaration $\langle p \Leftarrow R(p,s,\tilde{x},c)\rangle$,
which can interact with other processes in parallel, and a passive part, the recursive
definitions on the names $s$, $\tilde{x}$ and $c$, which are used to memorize the internal
state of the cell and to (linearly) recreate its active part each time the name
$p$ is invoked. Indeed, when $\langle p \mapsfrom D\rangle$ interacts with the name $p$, the unique
declaration on $p$ is consumed and a unique output on the restricted name $s$,
acting like a lock, is freed, which, in turn, frees a single declaration on $p$. Using
the derived operator $\langle p \mapsfrom D\rangle$, we give a very simple and direct interpretation
of concς.

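Translation rules

$$ \begin{array}{rcl}
[\![ (p \mapsto [l_i = \varsigma(x_i)b_i^{\,i\in 1..n}]) ]\!] &=& \langle p \mapsfrom [l_i = (\lambda x_i)[\![b_i]\!]^{\,i\in 1..n}]\rangle \\
[\![u]\!] &=& \mathit{reply}(u) \\
[\![u.l]\!] &=& (u \cdot \mathit{get}_l) \\
[\![u.l \Leftarrow \varsigma(x)b]\!] &=& (u \cdot \mathit{put}_l\ (\lambda x)[\![b]\!]) \\
[\![\mathit{clone}(u)]\!] &=& (u \cdot \mathit{clone}) \\
[\![\mathbf{let}\ x = a\ \mathbf{in}\ b]\!] &=& (\mathbf{set}\ x = [\![a]\!]\ \mathbf{in}\ [\![b]\!]) \\
[\![(\nu p)a]\!] &=& (\nu p)[\![a]\!]
\end{array} $$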



We can simplify this interpretation a step further by defining three shorthands,
for method select, method update and cloning.

$$ (P \Leftarrow l) = (P \cdot \mathit{get}_l) \qquad (P.l \Leftarrow (\lambda x)Q) = (P \cdot \mathit{put}_l\ (\lambda x)Q) \qquad \mathit{clone}(P) = (P \cdot \mathit{clone}) $$

With these notations we can consider that concς is directly embedded in
the Blue calculus. More interestingly, we embed a higher-order version of the
object calculus, and it is possible to define terms that are not in concς, like
$\mathit{clone}(P \mid Q)$ for example, or the selector function $(\lambda x)(x \Leftarrow l)$. We can also
derive a set of reduction sequences that simulate reduction in concς. Assume $D$
is the association $[l_i = (\lambda x_i)P_i^{\,i\in 1..n}]$ and $j \in 1..n$; then:

$$ \langle p \mapsfrom D\rangle \mid p \Leftarrow l_j \;\to^{*}\; \langle p \mapsfrom D\rangle \mid P_j\{x_j \leftarrow p\} $$

More formally, we prove that there is an operational correspondence between
concς and the Blue calculus. To state this result, we use an observational
equivalence between π*-terms defined in [7], denoted $\approx$, that is a variant of weak
barbed congruence [20]. Informally, this relation is the largest bisimulation that
preserves simple observations called barbs and that is a congruence.

Theorem 3.1. If $a \equiv b$, then $[\![a]\!] \approx [\![b]\!]$. If $a \to a'$, then $[\![a]\!] \to^{*}\approx [\![a']\!]$. If
$[\![a]\!] \to^{*} P$, then there exists a concς-term $a'$ such that $a \to^{*} a'$ and $P \approx [\![a']\!]$.

4 Type System 

We define a first-order type system for π*, inspired by the (Curry-style) simply
typed λ-calculus. It is essentially the type system given in [5], extended with
subtyping, record types, recursion and a special type constructor for continuations,
Reply(.). Then, we establish a set of derived typing rules for the object
notations introduced in the previous section, which simulate the typing rules of
concς.

We assume the existence of a denumerable set of type variables ranged over
by $\alpha, \beta, \ldots$ The syntax of type expressions is given by the following grammar.

$$ \begin{array}{rcll}
\tau, \vartheta, \varrho &::=& \alpha \mid (\tau \to \vartheta) \mid (\mu\alpha.\tau) \mid \mathit{Top} & \text{simple types} \\
&\mid& [\,] \mid [\varrho, l : \tau] \mid \mathit{Reply}(\tau) & \text{rows \& continuation type}
\end{array} $$

We consider that types are well formed with respect to a simple kinding
system, described in the extended version of this paper [8]. Informally, the kind
system is used to constrain the type $\varrho$ in the row $[\varrho, l : \tau]$ and, for example, to
rule out types such as $[(\vartheta \to \varrho), l : \tau]$.

In our system, a type environment is an association between names and types,
and also between type variables and kinds: $\Gamma ::= \emptyset \mid \Gamma, a : \tau \mid \Gamma, \alpha :: \kappa$. The
type system is based on four judgments: (1) $\Gamma \vdash \diamond$ and (2) $\Gamma \vdash \tau :: \kappa$, for
well-formed environments and types; (3) $\Gamma \vdash \tau <: \vartheta$ and (4) $\Gamma \vdash P : \tau$, stating that,
given $\Gamma$, type $\tau$ is a subtype of $\vartheta$ and term $P$ has type $\tau$.

The type constructors are all borrowed from type systems for functional
languages, apart from Reply(.), which is used to type the continuation operator
reply(P) (and linear application) and which is described later. The type Top
is the maximal type with respect to the subtyping relation. We make a
non-standard use of this type constant: Top is used to type terms that may not be
expected to return results, for example resources.

$$ \frac{\Gamma \vdash P : \tau \qquad (u : \tau) \in \Gamma}{\Gamma \vdash \langle u \Leftarrow P\rangle : \mathit{Top}}\ \text{(Proc Decl)} \qquad \frac{\Gamma \vdash P : \mathit{Top} \qquad \Gamma \vdash Q : \tau}{\Gamma \vdash (P \mid Q) : \tau}\ \text{(Proc Par)} $$



Subtyping. We do not give the details of the subtyping rules here. The rules for
the functional part of the system are standard. For example, arrow types $(\tau \to \vartheta)$
are contravariant in the first parameter and covariant in the second. The
subtyping rules for rows are less classical, and reflect the incremental construction
of records. Provided the rows are well formed, we have the following subtyping
rules, together with rules that allow identifying rows up to reordering of their
components.

$$ \frac{}{\Gamma \vdash [\varrho, l : \tau] <: [\,]} \qquad \frac{\Gamma \vdash \varrho <: \varrho' \qquad \Gamma \vdash \tau <: \tau'}{\Gamma \vdash [\varrho, l : \tau] <: [\varrho', l : \tau']} $$

Typing rules. The typing rules for the functional part of the calculus are those
of the simply typed λ-calculus extended with records and subtyping. The typing
rules for the π-calculus operators are the rules (Proc Par) and (Proc Decl) defined
previously. In particular, the type of a parallel composition, $P \mid Q$, is the type
of the main thread of computation, which is $Q$. The typing rule for declarations,
(Proc Decl), deserves more comment. Suppose that $\Gamma \vdash P : \vartheta$, with $(u : \tau) \in \Gamma$,
and that $u$ appears in subject position of a declaration $\langle u \Leftarrow Q\rangle$, for example
$P = (\langle u \Leftarrow Q\rangle \mid R)$. Since we may substitute $Q$ for an occurrence of $u$ in
$R$, see (2.2), the term $Q$ must have the type $\tau$. Using rule (Proc Decl), it is
easy to derive typing rules for definitions and higher-order application that are
equivalent to those found in the ML type system.

$$ \frac{\Gamma, u : \tau \vdash P : \tau \qquad \Gamma, u : \tau \vdash Q : \sigma}{\Gamma \vdash \mathbf{def}\ u = P\ \mathbf{in}\ Q : \sigma} \qquad \frac{\Gamma \vdash P : \tau \to \vartheta \qquad \Gamma \vdash Q : \tau}{\Gamma \vdash (P\ Q) : \vartheta} $$

Typing continuations. We explain the typing and subtyping rules for the
operator Reply(.). Recall that reply(a) stands for $(\lambda r)(r\,a)$. Let $\alpha$ be a fresh type
variable. It can be proved that if $P$ has type $\tau$, then $(\lambda r)(r\,P)$ has type $(\tau \to \alpha) \to \alpha$.
In $(\tau \to \alpha) \to \alpha$, the variable $\alpha$ is implicitly quantified, in the sense that reply(P)
can be given the type $\forall\alpha.((\tau \to \alpha) \to \alpha)$ in the ML type system. The type
$(\tau \to \alpha) \to \alpha$ is often found in typed continuation-passing-style transformations, and in
λ-calculi with exceptions, where it is sometimes given a dedicated notation [15]. To avoid
the introduction of quantified types, we use a new operator to type the term
reply(P), together with the introduction rule:

$$ \frac{\Gamma \vdash P : \tau}{\Gamma \vdash \mathit{reply}(P) : \mathit{Reply}(\tau)} $$

We can compare our usage of reply(.) with the usage of the operator let in
ML, which can be defined as syntactic sugar for the term $(\lambda x.M)\,N$, but which is
used in the type system to introduce parametric polymorphism.

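It is possible to validate the two following rules using the interpretation of
$\mathit{Reply}(\tau)$ as the type $(\tau \to \alpha) \to \alpha$ (for some fresh type variable $\alpha$).

$$ \frac{\Gamma \vdash P : \mathit{Reply}(\tau) \qquad \Gamma, x : \tau \vdash Q : \vartheta}{\Gamma \vdash \mathbf{set}\ x = P\ \mathbf{in}\ Q : \vartheta} \qquad \frac{\Gamma \vdash \tau <: \vartheta}{\Gamma \vdash \mathit{Reply}(\tau) <: \mathit{Reply}(\vartheta)} $$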
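
The reading of Reply(τ) as a continuation-passing type can be illustrated with a small Python sketch. It assumes, as a simplification that is only suggested by the typing rule and not taken from the paper's formal definition, that `set x = P in Q` behaves like applying P to the continuation (λx)Q; the function names `reply` and `set_in` are hypothetical.

```python
from typing import Callable, TypeVar

T = TypeVar("T")
A = TypeVar("A")


def reply(value: T) -> Callable[[Callable[[T], A]], A]:
    # reply(a) stands for (lambda r)(r a): hand the value to a continuation r.
    return lambda r: r(value)


def set_in(p: Callable[[Callable[[T], A]], A], q: Callable[[T], A]) -> A:
    # Assumed reading of "set x = P in Q": run P with continuation (lambda x) Q.
    return p(q)


result = set_in(reply(20), lambda x: x + 1)
nested = set_in(reply(2), lambda x: set_in(reply(3), lambda y: x * y))
```

The Python type `Callable[[Callable[[T], A]], A]` plays the role of (τ → α) → α, with the answer type `A` left generic in the same way α is implicitly quantified.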

The presence of Reply(.) is only a minor extension to the traditional type
system of π*, and it does not modify its interesting properties. In particular, we
prove that reduction preserves type judgments.

Theorem 4.1. If $\Gamma \vdash P : \tau$ and $P \to P'$, then $\Gamma \vdash P' : \tau$.

Typing objects. We prove a typed correspondence between concς and the Blue
calculus. To simplify our presentation, we first define a special notation for the
type of a denomination. Apart from the use of the operator Reply(.), this type
is analogous to the one obtained in the encoding of the Abadi-Cardelli functional
object calculus given by Viswanathan [25] and Sangiorgi [23].

$$ \mathrm{Obj}(\alpha.[l_i : \vartheta_i^{\,i\in 1..n}]) = \mu\alpha.\left[\begin{array}{l} \mathit{get}_{l_i} : \vartheta_i, \\ \mathit{put}_{l_i} : (\alpha \to \vartheta_i) \to \mathit{Reply}(\alpha),^{\,i\in 1..n} \\ \mathit{clone} : \mathit{Reply}(\alpha) \end{array}\right] $$

We prove that the object type, $\mathrm{Obj}(\alpha.\varrho)$, is the type of the name $p$ in
$\langle p \mapsfrom D\rangle$. In $\mathrm{Obj}(\alpha.\varrho)$, the variable $\alpha$ is called the self-type. We simply write $\mathrm{Obj}(\varrho)$
if the self-type does not appear free in $\varrho$. We can give derived typing rules for
the object constructs defined in Sect. 3.

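Derived typing rules for the embedding of objects

Assume $A$ is the type $\mathrm{Obj}(\alpha.[l_i : \vartheta_i^{\,i\in 1..n}])$.

$$ \frac{(p : A) \in \Gamma \qquad \forall i \in 1..n \quad \Gamma, x_i : A \vdash P_i : \vartheta_i\{\alpha \leftarrow A\}}{\Gamma \vdash \langle p \mapsfrom [l_i = (\lambda x_i)P_i^{\,i\in 1..n}]\rangle : \mathit{Top}} $$

$$ \frac{\Gamma \vdash P : A \qquad j \in 1..n \qquad \Gamma, x : A \vdash Q : \vartheta_j\{\alpha \leftarrow A\}}{\Gamma \vdash (P.l_j \Leftarrow (\lambda x)Q) : \mathit{Reply}(A)}\ \text{(Proc Updt)} $$

$$ \frac{\Gamma \vdash P : A}{\Gamma \vdash \mathit{clone}(P) : \mathit{Reply}(A)}\ \text{(Proc Clone)} \qquad \frac{\Gamma \vdash P : A \qquad j \in 1..n}{\Gamma \vdash P \Leftarrow l_j : \vartheta_j\{\alpha \leftarrow A\}}\ \text{(Proc Invk)} $$

We give only the derivation for method update. Recall that $(P.l_j \Leftarrow (\lambda x)Q)$
denotes the term $(P \cdot \mathit{put}_{l_j}\ (\lambda x)Q)$. Let $A$ denote the type $\mathrm{Obj}(\alpha.[l_i : \vartheta_i^{\,i\in 1..n}])$.
Suppose that $j$ is in $1..n$, that $\Gamma \vdash P : A$, and that $\Gamma, x : A \vdash Q : \vartheta_j\{\alpha \leftarrow A\}$.
It follows that $\Gamma \vdash (\lambda x)Q : (\alpha \to \vartheta_j)\{\alpha \leftarrow A\}$, and that
$\Gamma \vdash (P \cdot \mathit{put}_{l_j}) : ((\alpha \to \vartheta_j) \to \mathit{Reply}(\alpha))\{\alpha \leftarrow A\}$.
Hence $\Gamma \vdash (P.l_j \Leftarrow (\lambda x)Q) : \mathit{Reply}(A)$. Note that it is
impossible to extend the object $P$ with a new method, since the set $(\mathit{put}_{l_j})_{j\in 1..n}$
of fields is fixed. Moreover, as in the Abadi-Cardelli calculus of first-order objects,
it is impossible to refine the type of the updated method. Indeed, the type $\vartheta_j$
appears in contravariant position in field $\mathit{put}_{l_j}$, and in covariant position in field
$\mathit{get}_{l_j}$. For the same reason, we can prove that object types are not covariant,
that is, $\varrho <: \sigma$ does not imply $\mathrm{Obj}(\varrho) <: \mathrm{Obj}(\sigma)$. However, we prove that width
subtyping between object interfaces is sound.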

Lemma 4.1. $\mathrm{Obj}([l_i : \vartheta_i^{\,i\in 1..n+m}]) <: \mathrm{Obj}([l_i : \vartheta_i^{\,i\in 1..n}])$.
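
The combination of sound width subtyping with invariant method types can be pictured with a small structural check in Python. This is an illustrative sketch only: object interfaces are modeled as dictionaries from labels to type names, a hypothetical encoding that ignores self-types and the Reply constructor, and `is_width_subtype` is not a function from the paper.

```python
from typing import Dict

# Hypothetical encoding: label -> name of the method result type.
ObjType = Dict[str, str]


def is_width_subtype(sub: ObjType, sup: ObjType) -> bool:
    # sub <: sup when sub offers every label of sup at exactly the same type.
    # Types are compared for equality because method types are invariant.
    return all(label in sub and sub[label] == ty for label, ty in sup.items())


point2d = {"x": "Int", "y": "Int"}
point3d = {"x": "Int", "y": "Int", "z": "Int"}
```

Here `point3d` is accepted wherever `point2d` is expected, but not the converse, and changing the type of a shared label breaks the relation even if the new type would be a subtype of the old one.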

The type system of concς is based on the Abadi-Cardelli first-order object
calculus, $\mathbf{Ob}_{1<:}$, extended with new type constants for expressions, processes and
synchronization. In this system, a clear distinction is made between expressions,
that is, terms expected to return results, and processes, which intuitively represent
stores of expressions. The type system is then used for two different goals: first,
to guarantee that terms are well formed and that a name cannot be associated
to two different denominations; second, to avoid runtime errors, which are
instances of the so-called “message not understood” problem. In this paper, we
study a version that only guarantees safety of executions, but it is not difficult to
extend our type system to accommodate the first requirement, as in type systems
ensuring the “unique receiver” property in π [3].

As in $\mathbf{Ob}_{1<:}$, the basic type constructor is $[l_i : A_i^{\,i\in 1..n}]$, the type of objects
with methods $l_i$ returning results of types $A_i$ respectively. There is
also a constant, Proc, used to type processes, like denominations for example.
The type system is based on a subtyping relation, $E \vdash A <: B$, such that Proc
is the maximal type and that $[l_i : A_i^{\,i\in 1..n+m}] <: [l_i : A_i^{\,i\in 1..n}]$.



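$$ [\![\,[l_i : A_i^{\,i\in 1..n}]\,]\!] = \mathrm{Obj}([l_i : \mathit{Reply}([\![A_i]\!])^{\,i\in 1..n}]) \qquad [\![\mathit{Proc}]\!] = \mathit{Top} $$

$$ [\![E, x : A]\!] = [\![E]\!], x : [\![A]\!] \qquad [\![\emptyset]\!] = \emptyset $$

Theorem 4.2. The interpretation preserves subtyping judgments: if $E \vdash A <: B$,
then $[\![E]\!] \vdash [\![A]\!] <: [\![B]\!]$. The interpretation preserves typing judgments: if
$E \vdash a : \mathit{Proc}$, then $[\![E]\!] \vdash [\![a]\!] : \mathit{Top}$. If $E \vdash a : A$ and $A \neq \mathit{Proc}$, then
$[\![E]\!] \vdash [\![a]\!] : \mathit{Reply}([\![A]\!])$.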

In fact, we can prove a more general result than Theorem 4.2, since the
type system for concς does not have recursive types or self types, while our
interpretation can capture such notions.

There is another modification to concς inspired by our interpretation. It
consists in separating the two distinct roles of process and maximal type, that is,
in considering two different constants, Top and Proc, such that Top is the maximal
type and Proc is used to type denominations. These two roles are collapsed
in concς, as well as in the variant of π* defined in this paper. This emphasizes
the fact that, in a parallel composition $(a \mathbin{\Rsh} b)$, the value of $a$ can never be
communicated to the outside world, and thus only its side effects are observable.
The fact that the value returned by a term is lost is not an example of a “runtime
error”, but we can consider that it is a programming mistake and, with our
proposed modification, we can statically catch these mistakes.



5 Two Applications of our Interpretation 



A first application is the interpretation of synchronization primitives. Although
concς is a concurrent calculus, in the sense that multiple threads of computation
can interact in parallel, it is not obvious how to synchronize these threads.
The approach taken in [13] is to extend concς with operators for mutexes, which
are defined as special kinds of denominations. The Blue calculus has a natural
notion of synchronization based on asynchronous communication, exactly as in
π. Therefore, it is not surprising that our interpretation can be easily extended
to model mutexes. What is more interesting is that our interpretation is also
sound with respect to the typing rules for mutexes given in [13], and that
mutexes appear (again) as a special kind of linearly defined resources.



A second application, which is the most original part of this work, is to prove
equational laws between objects using barbed congruence between π*-terms and
our encoding. Let $\approx$ be the weak barbed congruence relation used in
Theorem 3.1. We can use our interpretation to prove that two concς-terms are
equivalent, by showing that their translations are equivalent. For example, if $p$ is not
free in $d$, we prove the following rules (among others):

$$ [\![(\nu p)((p \mapsto d) \mathbin{\Rsh} \mathit{clone}(p))]\!] \approx [\![(\nu p)((p \mapsto d) \mathbin{\Rsh} p)]\!] $$



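$$ [\![(\nu p)((p \mapsto d) \mathbin{\Rsh} \mathbf{let}\ x = (p.l \Leftarrow \varsigma(y)b)\ \mathbf{in}\ x.l)]\!] \approx [\![(\nu p)((p \mapsto d) \mathbin{\Rsh} \mathbf{let}\ y = (p.l \Leftarrow \varsigma(y)b)\ \mathbf{in}\ b\{y \leftarrow p\})]\!] $$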



The first rule can be viewed as a concurrent version of an equational law
proved for the imperative ς-calculus in [14], namely $(\mathbf{let}\ x = o\ \mathbf{in}\ \mathit{clone}(x)) \approx o$,
where $o$ is the object $(\nu p)((p \mapsto d) \mathbin{\Rsh} p)$, and $p$ is not free in $d$.

An interesting fact is that the proofs of these equalities are very simple.
Indeed, we only need to use “well-known” algebraic laws already proved for π* [7],
like relation (5.1) below, similar to the replication theorem found in [19].

$$ \mathbf{def}\ x = R\ \mathbf{in}\ (P \mid Q) \;\approx\; (\mathbf{def}\ x = R\ \mathbf{in}\ P) \mid (\mathbf{def}\ x = R\ \mathbf{in}\ Q) \tag{5.1} $$



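Another interesting fact is that the algebraic laws obtained on concς are still
valid if one extends this calculus with new primitives that can be encoded in π*,
such as mutexes for example. Therefore, it is not necessary to modify the proof
system each time the object calculus is extended.

As an example, we sketch the proof of the first equality. We use the notation of
Sect. 3. Let $E_p[.]$ be the context such that $[\![(p \mapsto d)]\!] = E_p[\langle p \Leftarrow R(p, s, \tilde{u}, c)\rangle]$.

$$ \begin{array}{rcll}
[\![(\nu p)((p \mapsto d) \mathbin{\Rsh} \mathit{clone}(p))]\!] &=& (\nu p)E_p[\langle p \Leftarrow R(p,s,\tilde{u},c)\rangle \mid p \cdot \mathit{clone}] & (1) \\
&\approx& (\nu p)E_p[R(p,s,\tilde{u},c) \cdot \mathit{clone}] & (2) \\
&\approx& (\nu p)E_p[s\,\tilde{u} \mid c\,\tilde{u}] & (3) \\
&\approx& (\nu p)E_p[s\,\tilde{u} \mid ((\lambda\tilde{x})(\nu q)(\mathrm{Fobj}(q,\tilde{x},c) \mid \mathit{reply}(q)))\,\tilde{u}] & (4) \\
&\approx& (\nu p)E_p[s\,\tilde{u} \mid (\nu q)(\mathrm{Fobj}(q,\tilde{u},c) \mid \mathit{reply}(q))] & (5) \\
&\approx& (\nu p)(E_p[s\,\tilde{u}] \mid (\nu q)([\![(q \mapsto d)]\!] \mid \mathit{reply}(q))) & (6) \\
&\approx& (\nu p)([\![(p \mapsto d)]\!] \mid (\nu q)([\![(q \mapsto d)]\!] \mid \mathit{reply}(q))) & (7) \\
&\approx& (\nu q)([\![(q \mapsto d)]\!] \mid [\![q]\!]) & (8)
\end{array} $$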



Step (1) uses an instance of the law $(\nu u)(\langle u \Leftarrow P\rangle \mid u) \approx (\nu u)P$, and step (3)
uses an instance of $(\mathbf{def}\ x = R\ \mathbf{in}\ x) \approx (\mathbf{def}\ x = R\ \mathbf{in}\ R)$. In steps (2) and (4), we
use the fact that selection and β-reduction are deterministic reduction steps.
For example, we prove that $((\lambda x)P)a \approx P\{x \leftarrow a\}$. In step (5), we use (5.1) to
distribute the definitions of $E_p[.]$ over parallel composition, and in step (6) we
use an intermediary result, $(\nu p)E_p[s\,\tilde{u}] \approx (\nu p)[\![(p \mapsto d)]\!]$, that is implied by the
laws used in steps (3) and (4). In step (7), we use a “garbage collection” law
similar to the following law: $((\nu u)\langle u \Leftarrow P\rangle) \mid Q \approx Q$.

6 Conclusion and Related Work 

We have shown how to derive reduction and type judgments of concς in the
Blue calculus in a rather simple and natural way. In our encoding, we model
objects as a particular kind of declarations, $\langle p \mapsfrom D\rangle$, that are “linearly managed”.
It is interesting to compare these declarations with the consumable declarations,
$\langle u \Leftarrow P\rangle$, used to model processes (of the π-calculus), and with the
replicated and immutable declarations, $\langle u = P\rangle$, used to model functions (of the
λ-calculus).

Many theoretical studies address the problem of modeling object-oriented
languages in procedural languages, but few of them have succeeded in preserving
powerful features such as subtyping. In [2], the authors propose a compositional
interpretation of a typed (sequential) object calculus with subtyping into $F_{<:}$,
a λ-calculus with second-order polymorphic types. Viswanathan improved this
result in [25], where he gives a fully abstract interpretation in a first-order
λ-calculus with reference cells and records. In both solutions, the encoding relies on
the so-called split method. Fisher and Mitchell [11] proposed another interesting
typed object calculus. However, none of those calculi can model concurrent and
interactive objects.

In [23], Sangiorgi gives the first interpretation of the Abadi-Cardelli typed
functional calculus with subtyping in π (see also [17]). This interpretation is
extended to the imperative case in [18, 21]. These interpretations, and the type
system used, are very different from ours. For example, in the coding of method
update, we do not use relay constructs. Intuitively, in our encoding, the number
of reductions when invoking a method does not depend on the number of
method updates applied to the object. Therefore, if these encodings were used
to implement concurrent objects, we would provide a more efficient implementation
of method invocation. Another major difference is that, in the proof of
the operational correctness property, we do not use a typed bisimulation.

There are also other formalisms used to model concurrent objects, mainly
based on the π-calculus, such as [9, 12, 16, 22, 24], that are not considered in this
paper.

We can compare our work with the proposal of [25], where the author gives
a syntax-oriented interpretation of a typed object calculus. Our approach brings
the same benefits as his. In particular, our interpretation defines a type-safe
way of implementing higher-order concurrent objects in the Blue calculus, and
therefore in π.

A benefit of our encoding is that we validate some possible extensions of
concς, like the extension of the type system with recursive types and self-types,
or the extension with a maximal type, say Top, that differs from the type
given to processes. Another interesting extension considered in this paper is the
addition of functions and higher-order constructs to concς. Indeed, functions can
be coded in the Abadi-Cardelli object calculus, but to simulate the types of functions
in a satisfactory way, one needs to use universally and existentially quantified
types, to the detriment of type inference [1]. With our approach, we propose
a natural extension of the object calculus with functions without noticeably
modifying the definition of the equivalence, the type system, or the interesting
equational laws. Another benefit is the study of equational laws between objects.
We give an example of such equational laws at the end of Section 5. It would be
interesting to study the equivalence obtained on concς using barbed congruence
and our encoding.


Abstract. We present a formalization of a typed π-calculus in the Calculus
of Inductive Constructions. We give the rules for type-checking and
for evaluation and formalize a proof of type preservation in the Coq system.
The encoding of the π-calculus in Coq uses Coq functions to represent
bindings of variables. This kind of encoding is called a higher-order
specification. It provides a concise description of the calculus, leading to
simple proofs. The specification we propose for the π-calculus formalizes
communication by means of function application.



1 Introduction 



The π-calculus is a model of concurrent computation based upon
the notion of naming. Processes receive and emit names, which denote channels.

We propose here new formalizations of both evaluation and typing rules for
a typed π-calculus, with the aim to enable machine-checked proofs of various
properties of languages based on this calculus. As a simple, typical illustration
of such a proof, we give a proof of preservation of types, also called the
subject-reduction property, for the calculus.

For our experiment, we chose the Coq system [BBC+97, HKPM97], based on
the Calculus of Inductive Constructions [CH88, PM92]. In this system, proofs are
performed by applying tactics, in a goal-directed manner. We like this system
because of its rich meta-language (a higher-order functional language, allowing
full dependent inductive types), and because it is based on Type Theory.

Thanks to the use of the higher-order abstract syntax technique, giving rise to
so-called higher-order specifications, our formalization of the π-calculus does not
require definitions for free or bound variables in a term. Nor does it require
definitions of notions of substitutions, which are implemented using the meta-level
application, i.e. application available in the Logical Framework used to implement
our calculus (which in our case is the Calculus of Inductive Constructions).

Robin Milner [Mil91] introduced notions of abstractions and concretions leading
to a presentation of the calculus in which input processes evaluate to functions
and output processes evaluate to concretions, i.e. (intuitively) pairs of a
value and a process. Our formalization starts from this idea, replacing concretions
by higher-order functions, thus leading to a nicer, more uniform presentation,
in which communication is formalized by function application.



J. van Leeuwen et al. (Eds.): IFIP TCS2000, LNCS 1872, pp. 425-^^^ 2000. 
© Springer-Verlag Berlin Heidelberg 2000

Related works include formalizations of various π-calculi in the same system.
These works were conducted with different goals in mind. Some of them had the
same goals as ours. The others aimed at the study of bisimulation techniques.
We shall discuss both kinds of related works in detail at the end of the paper.

Note on the Coq syntax. Terms in Coq are terms of a typed λ-calculus,
where the abstraction $\lambda x : t.a$ is written [x:t]a, the dependent product type
$\forall x : t.a$ is written (x:t)a, while the non-dependent product type $A \to B$ is
simply written A -> B. The application of a function $f$ to $x$ is written (f x).
The expression $\exists x : t.a$ is written (Ex [x:t]a). The type of sets is denoted as Set
while the type of propositions is denoted as Prop. As usual in Type Theory, the
type $A_1 \to \cdots \to A_n \to B$ where $B$ is a proposition represents (through the
Curry-Howard isomorphism) the inference rule with premisses $A_1, \ldots, A_n$ and
conclusion $B$. Inductive types are defined by a list of constructors together with
their types, as for example for the naturals:

Inductive nat : Set := O : nat | S : nat -> nat.

An inductive definition automatically generates both an induction principle for
the type being defined and elimination principles which enable the definition of
recursive functions on this type. In the following, Coq text will be given as above
in a typewriter font.

2 Syntax 

The terms of the π-calculus are processes, which receive and emit names denoting
channels. For the sake of simplicity, we do not consider here the operators of
choice and matching. It will be straightforward to add them.

2.1 The Monadic π-Calculus

Processes are defined by the following rule:

$$ \text{Processes}: \quad P, Q ::= \bar{x}y.P \mid x(y).P \mid \nu x : t.P \mid (P \mid Q) \mid\ !P \mid 0 $$

in which $x$ and $y$ belong to an enumerable set of names and $t$ denotes the type
of a name, as described in a later section.

The expression $\bar{x}y.P$ denotes a process which sends a name $y$ along the
channel $x$ (output operation), then behaves as $P$. The term $x(y).P$ denotes a
process which receives a name $z$ along the channel $x$ (input operation), then
behaves like $P\{z/y\}$ ($P$ in which $z$ replaces $y$). The expression $(P \mid Q)$ denotes
the parallel composition of $P$ and $Q$ where $P$ and $Q$ are concurrently active. $0$
denotes the empty process. $!P$ denotes the infinite process $P \mid P \mid \cdots$; the !
operator (pronounced “bang”) is called the replication operator. Finally, $\nu x : t.P$
declares $x$ as a private channel with type $t$ in $P$. The $\nu$ operator is called a
restriction as it restricts the use of the name $x$ to $P$.

In the term $\nu x : t.P$, the name $x$ is bound in the term $P$. Similarly, in the
term $x(y).P$, the name $y$ is bound in the term $P$, waiting to be substituted by
another name before $P$ can be executed.

Processes evaluate to other processes while performing actions. In the
standard approach, there are four kinds of actions: input actions $xy$ (input a value
for variable $y$ on a channel $x$), output actions $\bar{x}y$ (output value $y$ on a channel
$x$), bound output actions $\bar{x}(y : t)$, and silent actions $\tau$. Communications are
always silent.

$$ \text{Actions}: \quad A ::= xy \mid \bar{x}y \mid \bar{x}(y : t) \mid \tau $$

An original and central notion of the π-calculus is the notion of scope
extrusion. A name is said to be extruded when it is a private name sent to a process. In
such a case, the scope of the name is extended to the new process which receives
it. This phenomenon, which may make the notion of binding in the π-calculus
unclear at first glance, will be illustrated and discussed later on.

The informal account of terms requires the definition of the set of variables in
a term, that we denote as $Var(t)$. It also requires the definition of free variables
($F(t)$) and bound variables ($B(t)$) in a term. In the informal presentation of the
calculus, we will need a notion of substitution of a variable for another variable
in a term, written as usual as $P\{y/x\}$.

2.2 Formalization of the Syntax 

We encode the terms of the calculus in Coq as follows. First we introduce a new
parameter name, and suppose two usual properties on the set of names, that we
state as axioms:

$$ \forall x, y : \mathit{name},\ x = y \lor x \neq y; \qquad \forall x : \mathit{name}.\ \exists y : \mathit{name},\ x \neq y $$

Parameter name : Set.

Axiom name_decidable : (x,y:name) (x=y) \/ (~x=y).

Axiom name_different : (x:name)(Ex [y:name] (~x=y)).

Then we introduce the set of types as a parameter too, stating some
properties on this set later on.

Parameter typ : Set.

The processes are then defined as follows:

Inductive proc : Set :=
    Nil  : proc
  | In   : name -> (name -> proc) -> proc
  | Out  : name -> name -> proc -> proc
  | Par  : proc -> proc -> proc
  | Bang : proc -> proc
  | Res  : typ -> (name -> proc) -> proc.

The representation of processes requires some explanations. There are two
binding operators in the π-calculus: the input and the restriction operator.
We use here Coq functions to represent bindings of variables; hence the type
(name -> proc) in the types of input processes and restrictions. This kind of
encoding is called a higher-order specification. This method of formalization, called
the higher-order abstract syntax technique, provides an elegant way to formalize
the binding nature of the constructors. In particular, it provides for free a
definition of terms up to α-equivalence. This method also provides an elegant way
to formalize substitutions, as we shall see later on, when formalizing the rules
for evaluation.
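
To make the higher-order abstract syntax idea concrete outside Coq, here is a minimal Python sketch that mirrors the constructors of proc. It is an illustration under assumptions that are not in the paper: names and types are plain strings, and Python functions stand in for the Coq functions of type name -> proc.

```python
from dataclasses import dataclass
from typing import Callable, Union

Name = str  # assumption: names are plain strings here


@dataclass(frozen=True)
class Nil:
    pass


@dataclass(frozen=True)
class In:
    chan: Name
    body: Callable[[Name], "Proc"]  # binder represented by a host function


@dataclass(frozen=True)
class Out:
    chan: Name
    arg: Name
    cont: "Proc"


@dataclass(frozen=True)
class Par:
    left: "Proc"
    right: "Proc"


@dataclass(frozen=True)
class Bang:
    body: "Proc"


@dataclass(frozen=True)
class Res:
    typ: str  # assumption: types are plain strings here
    body: Callable[[Name], "Proc"]  # the restricted name is a function argument


Proc = Union[Nil, In, Out, Par, Bang, Res]

# x(y).(y<y>.0): substituting z for y is just function application.
receiver = In("x", lambda y: Out(y, y, Nil()))
instantiated = receiver.body("z")
```

No function for free names or for substitution is needed: applying `receiver.body` to `"z"` yields the process with `z` in place of the bound name, and two binders that differ only in the name of their Python parameter denote the same function.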

Note that the type name is declared as a parameter, not as a variable. This
prevents name from being instantiated by an inductive set.

A restricted name never gets instantiated. Nevertheless, using functions (i.e.
binding operations) to represent the restriction yields a simple and elegant
formalization of the notion of private names.

In our description of the π-calculus, the only visible argument of an action
will be the name of a channel. In other words, actions only need to carry a name.
As a consequence, there is no need for bound output actions in our development:

Inductive action : Set :=
    InA : name -> action | OutA : name -> action | Tau : action.

3 Simple Examples of Evaluation 

We give here some simple examples of communication. The simplest case of
communication is the following one, where a name $y$ is sent to $Q$, along a channel
$x$:

$$ \bar{x}y.P \mid x(z).Q \xrightarrow{\tau} P \mid Q\{y/z\} $$

Then let us illustrate the phenomenon of scope extrusion. This occurs in a
communication where a process sends a private name to an external process, as
in:

$$ \nu y : t.(\bar{x}y.P) \mid x(z).Q \quad\text{which reduces to}\quad \nu y : t.(P \mid Q\{y/z\}). $$

The private name $y$ has been passed to the external process $Q$. We say that $P$
has extruded the scope of the private channel $y$. We have a similar phenomenon
in imperative languages when a local name $y$ is sent to a procedure $Q$ taking its
parameters by reference.



4 Evaluation Rules 

The evaluation rules define a judgement $P \xrightarrow{\alpha} Q$ meaning that $P$ evaluates to $Q$
producing an action $\alpha$. The chosen semantics is the early transition semantics.

4.1 Informal Evaluation Rules

We need here the definition of all variables in a term ($Var(t)$), together with
the definitions of free and bound variables ($F(t)$ and $B(t)$). Moreover, usual
informal presentations of the calculus use the open and close rules, which open
and close the scope of a name, in order to describe the phenomenon of scope
extrusion that we illustrated above (Section 3). The informal description of the
evaluation rules is the following one.



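$$ \text{input:}\quad x(y).P \xrightarrow{xz} P\{z/y\} \qquad\qquad \text{output:}\quad \bar{x}y.P \xrightarrow{\bar{x}y} P $$

$$ \text{com:}\quad \frac{P \xrightarrow{xy} P' \qquad Q \xrightarrow{\bar{x}y} Q'}{P \mid Q \xrightarrow{\tau} P' \mid Q'} \qquad \frac{P \xrightarrow{xy} P' \qquad Q \xrightarrow{\bar{x}y} Q'}{Q \mid P \xrightarrow{\tau} Q' \mid P'} $$

$$ \text{open:}\quad \frac{P \xrightarrow{\bar{x}y} P' \qquad x \neq y}{\nu y : t.P \xrightarrow{\bar{x}(y:t)} P'} $$

$$ \text{close:}\quad \frac{P \xrightarrow{xy} P' \qquad Q \xrightarrow{\bar{x}(y:t)} Q' \qquad y \notin F(P)}{P \mid Q \xrightarrow{\tau} \nu y : t.(P' \mid Q')} \qquad \frac{P \xrightarrow{xy} P' \qquad Q \xrightarrow{\bar{x}(y:t)} Q' \qquad y \notin F(P)}{Q \mid P \xrightarrow{\tau} \nu y : t.(Q' \mid P')} $$

$$ \frac{P \xrightarrow{\alpha} P' \qquad x \notin Var(\alpha)}{\nu x : t.P \xrightarrow{\alpha} \nu x : t.P'} $$

$$ \text{par:}\quad \frac{P \xrightarrow{\alpha} P' \qquad B(\alpha) \cap F(Q) = \emptyset}{P \mid Q \xrightarrow{\alpha} P' \mid Q} \qquad \frac{P \xrightarrow{\alpha} P' \qquad B(\alpha) \cap F(Q) = \emptyset}{Q \mid P \xrightarrow{\alpha} Q \mid P'} $$

$$ \text{bang:}\quad \frac{!P \mid P \xrightarrow{\alpha} P'}{!P \xrightarrow{\alpha} P'} $$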



In the above set of rules, rules for input and output describe the basic cases of 
input and output processes. The com and close rules describe communication, 
while the rest of the rules describe the propagation of evaluation through a 
restriction, a parallel composition, or a replication. 

The replication rule could be simpler. The advantage of the chosen rule is 
that it enables the extension of the rules with a choice operator. 



4.2 Formalization of the Evaluation Rules 

We shall take full advantage here of our choice to use functions in the
formalization of the syntax.

First, as input processes are described as functions, we just say that input
processes evaluate to themselves. This follows Robin Milner's approach [Mil91],
where he introduced a notion of abstraction to denote the result of the evaluation
of input processes. We will define an inductive predicate evali for processes
performing an input action:

Inductive evali : proc -> action -> (name -> proc) -> Prop

with as first and simplest case the case for input processes:

evali_in : (x:name) (p:name->proc) (evali (In x p) (InA x) p)

Robin Milner also introduced a notion of concretion [Mil91], which (intuitively)
was a pair of a value and a process, to denote the result of the evaluation
of output processes. He then (informally) described communication as the
“application” of an abstraction to a concretion. To formalize communication by
a real application, we describe the evaluation of an output process as yielding
a higher-order function, which is then applied to the result of the evaluation
of an input process in order to describe a communication, as we shall see
later on.

Let us say that the evaluation of output processes yields higher-order
functions taking functions of type (name -> proc) as arguments. We define an
inductive predicate evalo for processes performing an output action:

Inductive evalo : proc -> action -> ( (name->proc) -> proc) -> Prop 

with as first and simplest case the case for output processes: 

evalo_out : (x,y:name) (p:proc)
  (evalo (Out x y p) (OutA x) [f:name->proc] (Par (f y) p))

Example We are now in a position to see how we can describe communication 
on a basic example: 



\[
\frac{P \xrightarrow{\bar{a}b} P' \qquad Q \xrightarrow{a(y)} Q'}
     {\bar{a}b.P \mid a(y).Q \xrightarrow{\tau} P' \mid Q'\{b/y\}}
\]

The evaluation of the input process $a(y).Q$ gives the function Q itself. Then
the result of the evaluation of the output $\bar{a}b.P$ is the function
λf : name -> proc. ((f b) | P). Applying this function to Q yields the expected
result: (Q b) | P.

(Out a b P) --(OutA a)--> [f:name->proc] ((f b) | P) = P'
(In a Q)    --(InA a)-->  Q

(Out a b P) | (In a Q) --Tau--> (P' Q) = (Q b) | P

Thus the evaluation of both input and output processes yields functions.
Communication is formalized by applying the result of the evaluation of the
output process to the result of the evaluation of the input process. Our
formalization has replaced concretions by higher-order functions, thus leading to
a nicer, more uniform presentation, in which communication is formalized by
a real application of a function. On a technical level, the higher-order abstract
syntax method enables us to describe substitution in evaluation rules by simply
applying a Coq function to terms (names in this case).

It is straightforward to prove (on paper) that this presentation of the
semantics of the π-calculus with functions is consistent with the standard
approach.

For example, the rule for left communication is formalized as follows, in
a third set of inductive rules describing the evaluation of processes that perform
silent actions:

Inductive eval : proc -> action -> proc -> Prop :=
  | eval_coml :
      (p:proc) (x:name) (p':(name->proc)->proc)
      (evalo p (OutA x) p') ->
      (q:proc) (q':name->proc) (evali q (InA x) q') ->
      (eval (Par p q) Tau (p' q'))
  [...].

Note that communication is always simply formalized by an application, no 
matter whether names are extruded or not during the communication. This is 
the reason why we do not need rules for opening and closing the scope of names 
in our formalization. 
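
The communication example can be replayed mechanically. The following is only a sketch in current Coq syntax (not the V6 notation of the listings above), under stated assumptions: the constructor set of proc is reduced to Nil, In, Out and Par, name is left abstract, eval contains the single rule eval_coml, and the lemma name comm_example is hypothetical.

```coq
(* Assumption: reduced signature; replication and restriction are left out. *)
Parameter name : Set.

Inductive proc : Set :=
  | Nil : proc
  | In  : name -> (name -> proc) -> proc   (* input: the binder is a Coq function *)
  | Out : name -> name -> proc -> proc
  | Par : proc -> proc -> proc.

Inductive action : Set :=
  | Tau  : action
  | InA  : name -> action
  | OutA : name -> action.

Inductive evali : proc -> action -> (name -> proc) -> Prop :=
  | evali_in : forall (x : name) (p : name -> proc),
      evali (In x p) (InA x) p.

Inductive evalo : proc -> action -> ((name -> proc) -> proc) -> Prop :=
  | evalo_out : forall (x y : name) (p : proc),
      evalo (Out x y p) (OutA x) (fun f : name -> proc => Par (f y) p).

Inductive eval : proc -> action -> proc -> Prop :=
  | eval_coml : forall (p : proc) (x : name) (p' : (name -> proc) -> proc),
      evalo p (OutA x) p' ->
      forall (q : proc) (q' : name -> proc),
      evali q (InA x) q' ->
      eval (Par p q) Tau (p' q').

(* (Out a b P) | (In a Q) --Tau--> (Q b) | P *)
Lemma comm_example :
  forall (a b : name) (P : proc) (Q : name -> proc),
    eval (Par (Out a b P) (In a Q)) Tau (Par (Q b) P).
Proof.
  intros a b P Q.
  (* The result (p' q') is convertible to Par (Q b) P by beta-reduction. *)
  exact (eval_coml (Out a b P) a (fun f : name -> proc => Par (f b) P)
           (evalo_out a b P) (In a Q) Q (evali_in a Q)).
Qed.
```

The substitution of b for the bound name of the input is never written down: it is the beta-reduction of the application (p' q') that the type checker performs when it accepts the proof term.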

The other rules (for the parallel, replication and restriction operators) need to
be duplicated, one for each type of action involved (input, output or silent). Note,
however, that all proofs, whether formal or informal, proceed by a case analysis on
the type of the action involved anyway. Thus this apparent defect should not be
considered a drawback of our formalization.

Thanks to the use of the higher-order abstract syntax technique, when eva- 
luating a process which declares a private name, we do not need to check that 
this private name is a fresh name. The freshness of the name naturally comes 
from the use of a universal quantification on names in the premiss of the rule. 

eval_res :
  (u:typ) (p:name->proc) (a:action) (q:name->proc)
  ((x:name) (eval (p x) a (q x))) ->
  (eval (Res u p) a (Res u q)).
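
To see the universally quantified premiss at work, here is a hedged sketch in current Coq syntax of a silent step under a restriction, where the private name itself is sent. The reduced constructor set, the restriction of eval to the two rules eval_coml and eval_res, and the lemma name private_name_sent are assumptions made for the illustration.

```coq
Parameter name : Set.
Parameter typ : Set.

Inductive proc : Set :=
  | Nil : proc
  | In  : name -> (name -> proc) -> proc
  | Out : name -> name -> proc -> proc
  | Par : proc -> proc -> proc
  | Res : typ -> (name -> proc) -> proc.   (* restriction binds via a function *)

Inductive action : Set :=
  | Tau  : action
  | InA  : name -> action
  | OutA : name -> action.

Inductive evali : proc -> action -> (name -> proc) -> Prop :=
  | evali_in : forall (x : name) (p : name -> proc),
      evali (In x p) (InA x) p.

Inductive evalo : proc -> action -> ((name -> proc) -> proc) -> Prop :=
  | evalo_out : forall (x y : name) (p : proc),
      evalo (Out x y p) (OutA x) (fun f : name -> proc => Par (f y) p).

Inductive eval : proc -> action -> proc -> Prop :=
  | eval_coml : forall (p : proc) (x : name) (p' : (name -> proc) -> proc),
      evalo p (OutA x) p' ->
      forall (q : proc) (q' : name -> proc),
      evali q (InA x) q' ->
      eval (Par p q) Tau (p' q')
  | eval_res : forall (u : typ) (p : name -> proc) (a : action) (q : name -> proc),
      (forall x : name, eval (p x) a (q x)) ->
      eval (Res u p) a (Res u q).

(* The private name y is sent on a; no freshness side condition is stated. *)
Lemma private_name_sent :
  forall (u : typ) (a : name) (P : proc) (Q : name -> proc),
    eval (Res u (fun y => Par (Out a y P) (In a Q))) Tau
         (Res u (fun y => Par (Q y) P)).
Proof.
  intros u a P Q.
  apply (eval_res u (fun y => Par (Out a y P) (In a Q)) Tau
                    (fun y => Par (Q y) P)).
  intro y.
  exact (eval_coml (Out a y P) a (fun f : name -> proc => Par (f y) P)
           (evalo_out a y P) (In a Q) Q (evali_in a Q)).
Qed.
```

The premiss is proved once for an arbitrary y introduced by intro, which is how the universal quantification stands in for the usual freshness check.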

The rules for parallel composition do not require any side condition regarding 
bound variables either. This is because the actions do not carry values here, but 
only the name of a channel. 

eval_parl : (p:proc) (a:action) (p':proc)
  (eval p a p') -> (q:proc) (eval (Par p q) a (Par p' q))

The rule for replication (omitted here) is straightforward.

5 Type Checking Rules

The type system that we chose uses a notion of directionality of names Esnni,
which is their ability to be used as emitters, receivers, both, or neither. This
name usage discipline is implemented by type-checking rules. This type system,
while simple, is widely used, as witnessed by the Pict experiment IPT117I, and
is non-trivial because of the subtyping that it captures.
For our purpose, it is more interesting than the type system for the polyadic
π-calculus iVLiiyil, whose typing rules are totally straightforward.

5.1 Types 

A type t of a channel name combines its capacity, which represents
how the channel can be used (read, write, both or none), with the type of the
names that can transit on it:

Capacities: c ::= read | write | both | none
Types:      t ::= (c, t)

We formalize the capacities of names as follows, where none is called top (for 
top of the mini-lattice): 

Inductive cap : Set :=
  read : cap | write : cap | both : cap | top : cap.

Remember we only defined types as: 

Parameter typ : Set . 

This means that we do not impose a particular implementation of types.
Instead we state some properties this implementation must enjoy. First we must
be able to extract both the capacity ⟨t⟩ and the type [t] parts of a type t:

Parameter get_cap : typ -> cap. 

Parameter get_typ : typ -> typ. 

Then we have a subtyping relation on capacities, denoted as ≤ and defined
as follows:

\[
c \leq c \;\mid\; both \leq c \;\mid\; c \leq none
\]

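The subtyping relation on capacities induces a subtyping relation on types,
that we still denote as ≤ and define as follows:

\[
t \leq t
\]
\[
\langle t' \rangle = none \;\Rightarrow\; t \leq t'
\]
\[
\langle t \rangle \leq read \,\wedge\, \langle t' \rangle = read \,\wedge\, [t] \leq [t'] \;\Rightarrow\; t \leq t'
\]
\[
\langle t \rangle \leq write \,\wedge\, \langle t' \rangle = write \,\wedge\, [t'] \leq [t] \;\Rightarrow\; t \leq t'
\]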

The ≤ predicates are formalized by the following inductive types, the obvious
rules of which we omit here:

Inductive stype_cap : cap -> cap -> Prop
Inductive stype : typ -> typ -> Prop

The covariance of the read capability and the contravariance of the write 
capability introduce a subtyping in depth, as the typing rules will show. 
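
Since the rules of the two predicates are omitted, the following sketch in current Coq syntax shows one possible reading of them, transcribed from the informal rules of this section. The constructor names (sc_refl, sc_both, sc_top, st_refl, st_top, st_read, st_write) and the two example statements are assumptions, not the author's definitions.

```coq
Inductive cap : Set := read | write | both | top.

Parameter typ : Set.
Parameter get_cap : typ -> cap.
Parameter get_typ : typ -> typ.

(* c <= c, both <= c, c <= none (none is called top). *)
Inductive stype_cap : cap -> cap -> Prop :=
  | sc_refl : forall c : cap, stype_cap c c
  | sc_both : forall c : cap, stype_cap both c
  | sc_top  : forall c : cap, stype_cap c top.

(* Read is covariant and write is contravariant in the carried type. *)
Inductive stype : typ -> typ -> Prop :=
  | st_refl  : forall t : typ, stype t t
  | st_top   : forall t t' : typ, get_cap t' = top -> stype t t'
  | st_read  : forall t t' : typ,
      stype_cap (get_cap t) read -> get_cap t' = read ->
      stype (get_typ t) (get_typ t') -> stype t t'
  | st_write : forall t t' : typ,
      stype_cap (get_cap t) write -> get_cap t' = write ->
      stype (get_typ t') (get_typ t) -> stype t t'.

Example both_below_write : stype_cap both write.
Proof. apply sc_both. Qed.

(* A capacity below write is either both or write. *)
Example below_write_inv :
  forall c : cap, stype_cap c write -> c = both \/ c = write.
Proof.
  intros c H. destruct c; inversion H; auto.
Qed.
```

The swapped arguments of stype in st_write are the only place where contravariance shows up.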

A process is said to be well typed in a particular environment, which gives
a type to each of its free names. Contrary to the usual case in typed programming
languages, the type-checking rules do not give types to terms in the π-calculus.

5.2 Examples 

The process $\bar{x}y \mid x(u).\bar{u}v \mid \bar{x}z$ is well typed in an
environment where x has the capability both (both read and write). The process
$\bar{x}y$ alone is well typed in an environment where x has a capability
c ≤ write, which means that c is either both or write.

The following example shows how certain interferences between processes
can be eliminated. Consider the process $P = \nu y : (both, t).(\bar{x}y.y(z))$,
where x has type (write, (write, t)). The process P gives the writing capacity to
the outside, while keeping the reading capacity on the name y for itself. After the
transmission of y, a process sending a value to P on channel x is sure that this
value cannot be captured by another process.

The previous example also illustrates the use of subtyping in depth. 



5.3 Informal Typing Rules 

The informal typing rules, defining a judgement env ⊢ proc, are self-explanatory.
(We omit the obvious rules for par and bang.) Note the need for the Var
predicate in the input and res rules.

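\[
\textit{typ\_nil}:\quad \Gamma \vdash 0
\]

\[
\textit{typ\_input}:\quad
\frac{\langle \Gamma(x) \rangle \leq read \qquad \forall z \notin Var(P) \quad \Gamma.(z : [\Gamma(x)]) \vdash P\{z/y\}}
     {\Gamma \vdash x(y).P}
\]

\[
\textit{typ\_output}:\quad
\frac{\langle \Gamma(x) \rangle \leq write \qquad \Gamma(y) \leq [\Gamma(x)] \qquad \Gamma \vdash P}
     {\Gamma \vdash \bar{x}y.P}
\]

\[
\textit{typ\_res}:\quad
\frac{\forall y \notin Var(P) \quad \Gamma.(y : t) \vdash P\{y/x\}}
     {\Gamma \vdash \nu x{:}t.P}
\]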



5.4 Formalization of the Typing Rules 

An environment is encoded as a function from names to types: 



Definition env : Type := name -> typ.

Then the type-checking rules are a straightforward translation of the informal
rules, where the only interesting point to note is that we do not need to
look for fresh names in the rules for input and restriction. The freshness
condition is implemented by a universal quantification on variables in the
premisses involved.

The complete set of rules is as follows: 

Inductive type : env -> proc -> Prop :=
    type_nil : (g:env) (type g Nil)
  | type_in :
      (g:env) (x:name) (p:name->proc)
      (stype_cap (get_cap (g x)) read) ->
      ((y:name) ((g y)=(get_typ (g x))) -> (type g (p y))) ->
      (type g (In x p))
  | type_out :
      (g:env) (x,y:name) (p:proc)
      (stype_cap (get_cap (g x)) write) ->
      (stype (g y) (get_typ (g x))) ->
      (type g p) ->
      (type g (Out x y p))
  | type_res :
      (g:env) (t:typ) (p:name->proc)
      ((y:name) ((g y)=t) -> (type g (p y))) ->
      (type g (Res t p)).
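
As an illustration of how these rules are used, the sketch below restates them in current Coq syntax and type-checks a hypothetical forwarder process x(y).z̄y.0. It makes several assumptions: stype_cap and stype are kept abstract (declared as parameters), proc is reduced to the constructors named in the rules, and the lemma type_forwarder with its three hypotheses is invented for the example.

```coq
Parameter name : Set.
Parameter typ : Set.

Inductive cap : Set := read | write | both | top.

Parameter get_cap : typ -> cap.
Parameter get_typ : typ -> typ.
(* Assumption: the subtyping predicates are left abstract in this sketch. *)
Parameter stype_cap : cap -> cap -> Prop.
Parameter stype : typ -> typ -> Prop.

Inductive proc : Set :=
  | Nil : proc
  | In  : name -> (name -> proc) -> proc
  | Out : name -> name -> proc -> proc
  | Res : typ -> (name -> proc) -> proc.

Definition env : Type := name -> typ.

Inductive type : env -> proc -> Prop :=
  | type_nil : forall g : env, type g Nil
  | type_in : forall (g : env) (x : name) (p : name -> proc),
      stype_cap (get_cap (g x)) read ->
      (forall y : name, g y = get_typ (g x) -> type g (p y)) ->
      type g (In x p)
  | type_out : forall (g : env) (x y : name) (p : proc),
      stype_cap (get_cap (g x)) write ->
      stype (g y) (get_typ (g x)) ->
      type g p ->
      type g (Out x y p)
  | type_res : forall (g : env) (t : typ) (p : name -> proc),
      (forall y : name, g y = t -> type g (p y)) ->
      type g (Res t p).

(* x(y).z<y>.0 is well typed if x can be read, z can be written, and every
   name of the type carried by x may be sent on z. *)
Lemma type_forwarder :
  forall (g : env) (x z : name),
    stype_cap (get_cap (g x)) read ->
    stype_cap (get_cap (g z)) write ->
    (forall y : name, g y = get_typ (g x) -> stype (g y) (get_typ (g z))) ->
    type g (In x (fun y : name => Out z y Nil)).
Proof.
  intros g x z Hread Hwrite Hsub.
  apply type_in.
  - exact Hread.
  - intros y Hy. cbv beta. apply type_out.
    + exact Hwrite.
    + apply Hsub. exact Hy.
    + apply type_nil.
Qed.
```

The received name y is handled by a plain intros: no fresh name is chosen and no environment extension is computed.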



6 Preservation of Types 

We have to prove three theorems, one for each kind of action. The theorem for
the silent actions simply states the following:

Theorem sr_tau: 

(p,q:proc) (eval p Tau q) -> (g:env) (type g p) -> (type g q) . 

The proof of this theorem uses the following natural property for communica- 
tions. The property says that a communication between two processes resulting 
from the evaluation of well-typed processes is well-typed: 

Theorem sr_com:
  (p:proc) (x:name) (p':(name->proc)->proc)
  (evalo p (OutA x) p') -> (g:env) (type g p) ->
  (q:proc) (q':name->proc) (evali q (InA x) q') -> (type g q) ->
  (type g (p' q')).

The proof of the above theorem needs the theorems stating the preservation
of types for input and output actions:

Theorem sr_in:
  (p:proc) (a:action) (q:name->proc)
  (evali p a q) -> (x:name) (a=(InA x)) ->
  (g:env) (type g p) ->
  (y:name) (stype (g y) (get_typ (g x))) -> (type g (q y)).

Theorem sr_out:
  (p:proc) (a:action) (q:(name->proc)->proc) (evalo p a q) ->
  (x:name) (a=(OutA x)) -> (g:env) (type g p) -> (f:name->proc)
  ((y:name) (stype (g y) (get_typ (g x))) -> (type g (f y))) ->
  (type g (q f)).

Note in the above theorems how we deal with the typing of functions. We
say that a function f from names to processes is well-typed if the process (f y)
is well-typed for every name y of the appropriate type. Similarly, a function F
from (names to processes) to processes is said to be well-typed if the process
(F f) is well-typed for every well-typed function f.

Except for some simple lemmas on the subtyping relation, no further lemmas 
are required in the proof. In particular there is no need for tedious proofs of 
lemmas about renaming, freshness of variables and the like, contrary to almost 
all the other developments we have seen so far. 

7 Related Work 

Related work includes several recent formalizations of various π-calculi, all done
in the Coq system. As we mentioned in the introduction of the paper, some
of these works had the same goals as ours, while others aimed at the study of
bisimulation techniques. To be fair to the experiments in the latter case, we
must say that the use of functions in our specification of the π-calculus, while
providing excellent results for our goals, might make the proofs of correctness of
bisimulation techniques a bit harder than usual.

An extensive implementation of the π-calculus, including proofs of correctness
of bisimulation techniques, has been realized by Daniel Hirschkoff [HirhTIHirhh]
using the de Bruijn method. Our contribution is much more modest in the sense
that it only provides a new basis for such a study. However, the de Bruijn
technique quickly leads to very obscure semantic descriptions and makes proofs
long and tedious. Seventy-five percent of the development by Daniel Hirschkoff
concerns the manipulation of de Bruijn codes, which is not the subject of interest.

A different method has recently been investigated by Guillaume Gillard
EHOi, on a concurrent object calculus recently proposed by Andrew Gordon
and Paul Hankin. Guillaume Gillard provides a proof of preservation of types
for this calculus. The method used is due to Andrew Gordon ICTl. It consists
in describing terms modulo alpha-conversion, over a predefined set of λ-terms
implemented with de Bruijn codes. The part of the development devoted to the
manipulation of de Bruijn terms, while still sizeable, is definitely more reasonable
than in the standard approach.

A recent development has been carried out by Loic Henry-Greard m n M]. The
proof done is the same as ours: the proof of preservation of types for the monadic
π-calculus. The proof uses a (slight extension of a) method due to Randy Pollack
and James McKinna, extensively used to formalize Lego in Lego. The basic idea
of the method is to distinguish between bound variables (represented by
variables) and free variables (represented by parameters). This gives a nice
solution to the problem of capture-avoiding substitution. However, there is still
an overhead, in both the descriptions and the proofs, to manipulate the variables
and the parameters. It is interesting to note that our proof, while being
considerably shorter, has exactly the same structure as Loic Henry-Greard's proof.

Finally, Furio Honsell, Marino Miculan and Ivan Scagnetto have proposed a
higher-order description of the π-calculus (with the match operator) [HMS98],
different from ours, which they have used in the proofs of correctness of various
bisimilarity laws. Like ours, their formalisation uses meta-level (Coq) functions
to encode the input and restriction operators. However, the rest of the
presentation greatly differs from ours. They need to duplicate the evaluation
rules in a non-natural way: some of the rules yield processes while the others
yield functions from names to processes. Communication is formalized using open
and close rules, which, despite the use of higher-order syntax, require a freshness
predicate. Meanwhile, they provide a formal study of some bisimilarity principles,
which, as it uses rules of evaluation (partly) described by means of functions,
should a priori inspire a similar study using our specification.

8 Conclusion and Future Work 

We have presented in this paper a higher-order formalization of a monadic
π-calculus in the Calculus of Inductive Constructions, and provided a proof of type
preservation for this calculus in the Coq system. The specification we proposed
formalizes communication by a function application. Our proof is very short: it
is only three pages long. We believe that the same proof performed on
alternative descriptions of the same calculus would give much longer and more
tedious proofs. Extending our development to the polyadic π-calculus, while
requiring longer proofs, should be straightforward.

It might be worth noting that the technique of higher-order abstract syntax
has rarely been used to describe imperative languages up to now. It seems that
this method, while being of great benefit in the descriptions of functional
programming languages, is a bit less suitable for imperative languages. However,
the π-calculus is considered an imperative language, as communication is
described as a side-effect in the meta-theoretical studies of the calculus. We may
hope that the specification proposed here might give new ideas for the description
of imperative languages.

Another interesting point to note is that only a limited form of higher-order
abstract syntax is required to describe the usual first-order π-calculi. The point
is that we only need functions from names to processes to describe binders, while
functions from processes to processes would be required in the description of the
higher-order π-calculus, where processes themselves can be passed on a channel.
The problem is that such types are not allowed as types of constructors of an
inductive type for processes, at least in the standard way provided by
almost all the current systems providing induction. Consequently, we are not able
to use the technique of higher-order abstract syntax to describe the higher-order
π-calculus today. However, there are already proposals for various meta-logics
in the literature jl )PSh7ll )bh8fM MDTlHot^ which could be used as a basis for
the implementation of the system we need.



Concerning future work, our main goal is to provide the basis for a
machine-checked study of languages based on the π-calculus. Particularly relevant
here are proofs of properties such as the preservation of types for languages for
which the rich type system makes those proofs challenging, even in their
hand-written presentations. We think here of languages involving polymorphism,
such as the Pict Fnm or the Join [FGL+96] languages, or of type-checking rules to
prevent deadlock as proposed in [KHEnBI. Other interesting languages will be those
arising from the various current proposals for concurrent and object calculi
jsnnBin.

As we already mentioned, future work of a different nature is the study of the
theory of the π-calculus: bisimilarity techniques. It would be very interesting to
see whether the higher-order description of the π-calculus we propose here can
be of real benefit in such formal studies too.

In the long term, we hope that such formal studies will be used to formalize
proofs of correctness of programs written in languages based on the π-calculus.



Abstract. The SOS formats ensuring that bisimilarity is a congruence often
fail in the presence of structural axioms on the algebra of states.
Dynamic bisimulation, introduced to characterize the coarsest congruence
for CCS which is also a (weak) bisimulation, reconciles the bisimilarity
as congruence property with such axioms and with the specification of
open ended systems, where states can be reconfigured at run-time, at
the cost of an infinitary operation at the meta-level. We show that the
compositional framework offered by tile logic is suitable to deal with
structural axioms and open ended systems specifications, allowing for a
finitary presentation of context closure.

Keywords: Bisimulation, SOS formats, dynamic bisimulation, tile logic. 



Introduction 

The semantics of dynamic systems can be conveniently expressed via labelled
transition systems (LTS) whose states are terms over a certain algebra and whose
labels describe some abstract behavioral information. Provided such
information models the possible interactions between various components, the
framework yields a compositional semantics. Plotkin's structured operational
semantics (SOS) ^ is one of the most successful such frameworks, where the
transitions a system can perform are defined by recursion on the structure of its
states.

Several notions of equivalence on the state space of LTSs have been considered
in the literature that take into account particular aspects. For example, if
one is interested only in the action sequences performed by the system, then one
should observe traces, whereas in a truly concurrent approach one would rather
observe partial orderings of actions. State equivalences can then be defined on
the basis of the chosen observables. In this paper we consider bisimulation
equivalences (with bisimilarity meaning the maximal bisimulation), where the
entire branching structure of the transition system is accounted for: two states
are equivalent if whatever transition one can perform, the other can simulate it
via a transition with the same observation, still ending in equivalent states.

One of the main advantages of a compositional semantics is that each
subcomponent of a state can be safely replaced by any equivalent subcomponent
without affecting the overall behavior. This triggered many efforts devoted to
the study of SOS formats whose syntactic constraints guarantee that bisimilarity
is a congruence. Among the most popular such formats, we mention the simple
De Simone format HOI, the more general positive GSOS format [3], and the
‘liberal’ family of tyft formats |ldll| (tyxt, zyft, promoted tyft, ...). Yet, in
many interesting cases the use of these formats is not straightforward. For
instance, although it is often convenient to have some structural axioms on states,
these cannot be handled by ordinary formats. This is the case for several semantic
descriptions based on the chemical abstract machine [2], where operators are
often assumed to be associative, commutative and with unit, e.g., the parallel
composition of CCS. Further examples are given by systems modeled using UML
graphs, which must be taken up to suitable isomorphisms and therefore require
the LTS to be defined on suitable structural equivalence classes. Also the (finite)
π-calculus CHI is defined by rules in good format when substitution is explicit,
but agents are subject to substitution axioms.

It is therefore often necessary to resort to the largest congruence included
in the bisimilarity by an operation of closure ‘for all contexts’, which results in
a congruence which is no longer a bisimulation. Dynamic bisimulation pm,
instead, performs such a context closure during the bisimulation ‘game’. Dynamic
bisimilarity was shown to capture the coarsest equivalence for CCS agents among
weak bisimulations that are also congruences, being completely axiomatized by
the axioms of strong observational equivalence plus two of Milner's three
τ-laws. The basic idea is to allow at every step of bisimulation not only the
execution of an action, but also the embedding of the two states under comparison
within the same, but otherwise arbitrary, context. It is worth remarking that
such dynamic contextual embedding has a natural interpretation in terms of
dynamic reconfiguration of the system, and hence can find many applications in
practice for modeling open ended systems. Its main drawback is the lack of a
convenient representation at the level of the SOS rules; it rather has the spirit of
a ‘meta’ construction, involving a universal quantification on contexts. An
interesting approach - in the style of Sewell's work on defining LTS from reduction
systems pm - would be to enrich the transition system with all (unary) contexts
as labels, and all transitions $p \xrightarrow{C[-]} C[p]$ for any state p and
context C[-]. This solution, however, would still be at the meta-level and
infinitary in principle.



In this paper we propose to recast dynamic bisimulation inside the tile model,
where it can be finitely modeled. The tile model [El is a formalism for modular
descriptions of the dynamic evolution of concurrent systems. It relies on a form
of rewrite rules with side effects, called basic tiles, which are reminiscent of both
SOS rules and context systems m, collecting intuitions coming from structured
transition systems P| and rewriting logic pni. In particular, by analogy with
rewriting logic, the tile model can be defined employing a logical presentation,
called tile logic, where tiles are (decorated) sequents subject to inference rules.

Figure 1. Horizontal, parallel and vertical tile compositions.

Tile logic. Tile logic extends rewriting logic (in the unconditional case) by
taking into account state changes with side effects and rewriting synchronization.
Basically, a set of rules describes the behaviour of certain (partially specified, in
the sense that they can contain variables) configurations, which may interact
through their interfaces. Then, the behaviour of a system as a whole consists of a
coordinated evolution of its sub-systems. The name ‘tile’ is due to the graphical
representation of such rules, which have the form

\[
\begin{array}{ccc}
\text{initial input interface} & \xrightarrow{\;s\;} & \text{initial output interface} \\
{\scriptstyle a}\,\big\downarrow & \alpha & \big\downarrow\,{\scriptstyle b} \\
\text{final input interface} & \xrightarrow{\;s'\;} & \text{final output interface}
\end{array}
\]

also written $\alpha : s \xrightarrow[b]{a} s'$, stating that the initial
configuration s of the system evolves to the final configuration s' via the tile α,
producing the effect b, which can be observed by the rest of the system. However,
such a step is allowed only if the subcomponents of s (i.e., the arguments of the
context s) evolve to the subcomponents of s', producing the effect a, which acts
as the trigger of α. Triggers and effects are called observations and tile vertices
are called interfaces. The arrows s, a, b and s' give the border of α.

Tiles are a natural model for reactive systems that are compositional in space 
- the behaviour of structured states can be defined as a coordination of the 
activities of the subcomponents - and compositional in time, as they offer a 
powerful framework for modelling the composition of computations. Indeed, tiles 
can be composed horizontally, vertically, and in parallel to generate larger steps. 
The operation of parallel composition corresponds to building concurrent steps, 
where two (or more) disjoint configurations can concurrently evolve. Of course, 
the border of a concurrent step is the parallel composition of the borders of each 
component of the step. Horizontal composition yields rewriting synchronization: 
the effect of the first tile provides the trigger for the second tile, and the resulting 
tile expresses the synchronized behaviour of both. Vertical composition models 
the execution of a sequence of steps starting from an initial configuration. It 
corresponds to sequential composition of computations. The three compositions
are illustrated in Figure 1.

Given a set of basic tiles, the associated tile logic is obtained by adding some
canonical ‘auxiliary’ tiles and then closing by composition - horizontally,
vertically, and in parallel - both auxiliary and basic tiles. As an example,
auxiliary tiles may be introduced that accommodate isomorphic transformations of
interfaces, yielding consistent rearrangements of configurations and observations.

Tile logic deals with algebraic structures on configurations that can be
different from the ordinary, tree-like presentation of terms, e.g., term graphs,
partitions, preorders, relations, net processes. In fact, all these structures give
rise to monoidal categories and, therefore, possess the two basic operations needed
by tile configurations - the monoidal tensor product gives the parallel
composition and the sequential composition of the category represents the
application of a context to its arguments. In this paper we shall use (monoidal)
categories in a very elementary way, i.e., just for representing terms and
substitution diagrammatically as abstract arrows (from the arguments to the
result). In particular we shall consider neither the categorical models of tile
logic (expressed by suitable monoidal double categories fEED), nor the axiomatized
proof terms decorating tile sequents. By varying the algebraic structure of
configurations and observations, tiles can model many different aspects of dynamic
systems, ranging from synchronization of net transitions |Z], to causal
dependencies for located calculi and finitely branching approaches for
name-passing calculi HH. to actor systems m. In addition, tile logic allows one to
reason about terms with variables as Larsen and Xinxin's context systems m. do,
while SOS formats work for ground terms only.

Several formats for tiles have been defined where the structure of
configurations is given by the term algebra over a signature. Namely, (1) the
monoidal tile format CZ], which has monoidal structures of both configurations and
observations; (2) the algebraic tile format, which has a cartesian structure of
configurations but only a monoidal structure of observations; and (3) the term
tile format m, which has cartesian structures of both configurations and
observations. Although none of them ensures that tile bisimilarity is a congruence,
by restricting these formats one can easily recover either the De Simone, or the
positive GSOS, or the zyft format. Here, we shall focus only on the monoidal
and term tile formats, showing that it is always possible to manage dynamic
bisimulation via ordinary tile bisimulation by extending the vertical signature
with a finite number of operators determined from the signature of
configurations. In particular, for each operator f of the horizontal signature we
shall add the observation f and the tile

\[
\begin{array}{ccc}
\bullet & \xrightarrow{\;id\;} & \bullet \\
{\scriptstyle id}\,\big\downarrow & & \big\downarrow\,{\scriptstyle f} \\
\bullet & \xrightarrow{\;f\;} & \bullet
\end{array}
\]

where id denotes identity in the appropriate category. Such a tile can then be
applied to any configuration, embedding it in context f and producing effect
f, which can now be observed in the ordinary bisimulation game. As we shall
see, the congruence proof for bisimilarity in the enriched systems can be carried
out as an abstract tile pasting. Moreover, such bisimilarity coincides with the
dynamic bisimilarity on the original system.

The idea of allowing contexts as observations has been at the basis of the
promoted tyft/tyxt format Q, designed for dealing with higher order languages.
We think that such an extension can be carried out in a natural way in the
abstract framework provided by tile logic.

Structure of the paper. In Section 1 we fix the notation, recall the most
widespread rule formats for transition system specifications, and motivate dynamic
bisimulation. In Section 2 we summarize the tile formats which we focus on.
Section 3 introduces syntactical constraints on basic tiles that guarantee the
‘bisimilarity as a congruence’ property. Section 4 presents our main results: a
finitary presentation of dynamic bisimulation via monoidal and term tile systems.



1 Bisimulation and SOS Formats 

The notion of bisimulation dates back to the pioneering work of David Park and
Robin Milner on process algebras and provides the standard framework
to express behavioural equivalences of complex dynamical systems.

Definition 1. A labeled transition system (LTS) is a triple $L = (S, \Lambda, \to)$,
where S is a set of states, Λ is a set of labels, and → is a ternary relation

\[
\to \;\subseteq\; S \times \Lambda \times S.
\]

We let s, t, s', t', ... range over S and a, b, c, ... range over Λ. For
$(s, a, s') \in\, \to$ we use the notation $s \xrightarrow{a} s'$.

Definition 2. For $L = (S, \Lambda, \to)$ an LTS, a bisimulation on L is a symmetric,
reflexive relation $\sim\; \subseteq S \times S$ such that if $s \sim t$, then for any
transition $s \xrightarrow{a} s'$ there exists a transition $t \xrightarrow{a} t'$
with $s' \sim t'$.

We denote by $\simeq$ the largest bisimulation and call it bisimilarity, and we say
that two states s and t are bisimilar whenever $s \simeq t$ or, equivalently,
whenever there exists a bisimulation $\sim$ such that $s \sim t$.
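
Definitions 1 and 2 can be transcribed almost literally into a proof assistant. The sketch below, in Coq, is only an illustration under stated assumptions: the names State, Label, step, bisimulation and bisimilar are hypothetical, and bisimilarity is encoded through the second characterization above (existence of a bisimulation relating the two states).

```coq
Section Bisimulation.
  (* An LTS: states, labels and a ternary transition relation. *)
  Variables State Label : Type.
  Variable step : State -> Label -> State -> Prop.

  (* A symmetric, reflexive relation closed under matching transitions. *)
  Definition bisimulation (R : State -> State -> Prop) : Prop :=
    (forall s t : State, R s t -> R t s) /\
    (forall s : State, R s s) /\
    (forall s t : State, R s t ->
       forall (a : Label) (s' : State), step s a s' ->
       exists t' : State, step t a t' /\ R s' t').

  Definition bisimilar (s t : State) : Prop :=
    exists R : State -> State -> Prop, bisimulation R /\ R s t.

  (* The identity relation is a bisimulation on any LTS. *)
  Lemma eq_is_bisimulation : bisimulation (@eq State).
  Proof.
    unfold bisimulation. split; [ | split ].
    - intros s t H. symmetry. exact H.
    - intros s. reflexivity.
    - intros s t H a s' Hstep. subst t.
      exists s'. split; [ exact Hstep | reflexivity ].
  Qed.

  Lemma bisimilar_refl : forall s : State, bisimilar s s.
  Proof.
    intro s. unfold bisimilar. exists (@eq State).
    split; [ exact eq_is_bisimulation | reflexivity ].
  Qed.
End Bisimulation.
```

Because the relation is required to be symmetric, a single transfer clause (from s to t) is enough; the converse direction follows by applying it to the swapped pair.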

In this paper we shall consider LTSs whose states are terms over a given
signature Σ. Although our results are easily extended to many-sorted signatures,
for simplicity we focus on the one-sorted case. A one-sorted signature is a set
Σ of operators together with an arity function $ar_\Sigma : \Sigma \to \mathbb{N}$
assigning to each operator the number of arguments it takes. The subset of Σ
consisting of the operators of arity n is denoted by $\Sigma_n$. Operators in
$\Sigma_0$ are called constants. We denote by $T_\Sigma(X)$ the term algebra over Σ
and variables in X (with X and Σ disjoint). We use $T_\Sigma$ for
$T_\Sigma(\emptyset)$, the term algebra over Σ. For $t \in T_\Sigma(X)$, we
write var(t) for the set of variables that appear in t. A term t is said to be
closed, or ground, if $var(t) = \emptyset$. We use a for a constant
$a() \in T_\Sigma(X)$.

A substitution is a mapping $\sigma : X \to T_\Sigma(X)$. It is closed if each
variable is mapped into a closed term. Substitutions extend to mappings from terms
to terms as usual: σ(t) is the term obtained by concurrently replacing all
occurrences of variables x in t by σ(x). The substitution mapping $x_i$ to $t_i$
for $i \in [1, n]$ is denoted by $[t_1/x_1, \dots, t_n/x_n]$. A substitution σ' can
be applied elementwise to a substitution $\sigma = [t_1/x_1, \dots, t_n/x_n]$,
yielding the composed substitution

\[
\sigma;\sigma' = \sigma'([t_1/x_1, \dots, t_n/x_n]) = [\sigma'(t_1)/x_1, \dots, \sigma'(t_n)/x_n].
\]

A context $t = C[x_1, \dots, x_n]$ denotes a term in which at most the distinct
variables $x_1, \dots, x_n$ appear. The term $C[t_1, \dots, t_n]$ is then obtained
by applying the substitution $[t_1/x_1, \dots, t_n/x_n]$ to $C[x_1, \dots, x_n]$. A
context can therefore be regarded as a function from n terms to one. Notice that
the $x_i$ may as well not appear in $C[x_1, \dots, x_n]$. For example, the context
$x_2[x_1, x_2, x_3]$ is a substitution from three arguments to one, which is the
projection on the second argument.

A term is linear if each variable occurs at most once in it. Similarly, $\sigma =
[t_1/x_1, \dots, t_n/x_n]$ is linear if each $t_i$ is linear and $var(t_i) \cap var(t_j) = \emptyset$ for $i \neq j$.

Substitutions and their composition _;_ form a (cartesian) category $Subs_\Sigma$,
with linear substitutions forming a monoidal subcategory. An alternative
presentation of $Subs_\Sigma$ can be obtained resorting to algebraic theories. An algebraic
theory [El is a cartesian category having ‘underlined’ natural numbers as
objects. The free algebraic theory associated to a signature $\Sigma$ is denoted by $Th[\Sigma]$:
the arrows from $\underline{m}$ to $\underline{n}$ are in a one-to-one correspondence with n-tuples of terms
of the free $\Sigma$-algebra with (at most) m variables, and composition of arrows is
term substitution. In particular, $Th[\Sigma]$ is isomorphic to $Subs_\Sigma$, and the arrows
from $\underline{0}$ to $\underline{1}$ are in bijective correspondence with the closed terms over $\Sigma$. As
a matter of notation, we assume a standard naming of the m input variables,
namely $x_1, \dots, x_m$. When composing two arrows $s: \underline{m} \to \underline{k}$ and $t: \underline{k} \to \underline{n}$, the
resulting term s;t is obtained by replacing each occurrence of $x_i$ in t by the
i-th term of the tuple s, for $i \in [1, k]$. For example, constants a, b in $\Sigma$ are
arrows from $\underline{0}$ to $\underline{1}$, a binary operator $f(x_1, x_2)$ defines an arrow from $\underline{2}$ to $\underline{1}$, and
the composition $\langle a, b\rangle; \langle f(x_1, x_2), x_1\rangle; f(x_2, x_1)$, where the angle brackets denote
term tupling, yields the term $f(a, f(a, b))$, which is an arrow from $\underline{0}$ to $\underline{1}$. In fact,

$$\langle a, b\rangle; \langle f(x_1, x_2), x_1\rangle; f(x_2, x_1) = \langle f(a, b), a\rangle; f(x_2, x_1) = f(a, f(a, b)).$$
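The composition of arrows in the free algebraic theory can be replayed mechanically. The sketch below is an illustration under assumed encodings, not the paper's construction: the variable $x_i$ is the integer `i`, an operator application is a tuple `(op, args...)`, and an arrow with n outputs is a list of n terms; the names `plug` and `seq` are hypothetical.

```python
def plug(term, tup):
    """Replace each positional variable i in term by the i-th component of tup."""
    if isinstance(term, int):            # the variable x_i
        return tup[term - 1]
    op, *args = term
    return (op, *(plug(arg, tup) for arg in args))


def seq(s, t):
    """Diagrammatic composition s;t: substitute the tuple s into every component of t."""
    return [plug(component, s) for component in t]


A, B = ("a",), ("b",)
FIRST = [A, B]                  # <a, b>            : 0 -> 2
SECOND = [("f", 1, 2), 1]       # <f(x1, x2), x1>   : 2 -> 2
THIRD = [("f", 2, 1)]           # f(x2, x1)         : 2 -> 1

# Composing left to right reproduces the derivation given in the text.
STEP = seq(FIRST, SECOND)       # <f(a, b), a>
RESULT = seq(STEP, THIRD)       # f(a, f(a, b))

# Composition is associative, so bracketing the other way gives the same arrow.
RESULT_ALT = seq(FIRST, seq(SECOND, THIRD))
```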

Monoidal theories are to algebraic theories what linear substitutions are to
generic substitutions. More precisely, in monoidal theories variables can be
neither duplicated (as e.g. in $f(x_1, x_1)$) nor projected. Even though terms are
formally annotated with the variables on which they are built, when no confusion
can arise, we avoid such annotations, and also the use of angle brackets.

LTS defined over closed terms of a given signature $\Sigma$ and label alphabet A can
be conveniently specified as collections of inductive proof rules, called transition
system specifications. A transition rule $\alpha$ has the form

$$\frac{s_1 \xrightarrow{a_1} t_1 \quad \cdots \quad s_n \xrightarrow{a_n} t_n}{s \xrightarrow{a} t}$$

where the $s_i$, $t_i$, s and t range over $T_\Sigma(X)$ and the $a_i$, a range over A. Transitions
in the upper part of the rule are called premises, the one in the lower part
conclusion. The rule $\alpha$ is closed if it does not contain variables. A transition
system specification (tss) is a set of transition rules.

A proof of a closed transition rule $H / s \xrightarrow{a} t$, with H a set of (closed) premises,
is a well-founded, upwardly branching tree whose nodes are labeled by closed
transitions, the root is labeled by $s \xrightarrow{a} t$ and if $H_r$ is the set of labels for the
nodes above a node r with label $s_r \xrightarrow{a_r} t_r$ then either $H_r / s_r \xrightarrow{a_r} t_r$ is a closed
rule that can be obtained as an instance of a rule in the TSS, or $s_r \xrightarrow{a_r} t_r \in H$
and $H_r = \emptyset$. A proof of a closed transition $s \xrightarrow{a} t$ is a proof of $\emptyset / s \xrightarrow{a} t$. The
LTS associated to a TSS consists of the set of provable closed transitions.

Among the formats that guarantee the fundamental property of ‘bisimilarity
as congruence’ on the associated LTS we shall in the following recall the De
Simone, the GSOS and the tyft formats. Recall that an equivalence relation
$\mathcal{R}$ over $T_\Sigma$ is a congruence if it respects the algebraic structure of states given
by $\Sigma$, i.e. if for all $f \in \Sigma$

$s_i \,\mathcal{R}\, t_i$, for $i \in [1, ar(f)]$, implies $f(s_1, \dots, s_{ar(f)}) \,\mathcal{R}\, f(t_1, \dots, t_{ar(f)})$.

Definition 3. Let $\Sigma$ be a signature and A an alphabet of observations. A
transition rule is in De Simone format if it has the form

$$\frac{\{x_i \xrightarrow{a_i} y_i \mid i \in I\}}{f(x_1, \dots, x_n) \xrightarrow{a} t}$$

where the $a_i$ and a are labels in A, f is an n-ary operator, $I \subseteq \{1, \dots, n\}$ and the
variables $x_i$ and $y_i$ are all distinct and form the set V. Moreover, the target
$t \in T_\Sigma(V)$ does not contain $x_i$ for $i \in I$ and is linear. A TSS is in De Simone
format if all its rules are in such form.

Observe that the source of the conclusion of a rule in De Simone format con- 
sists of a single function symbol. This fact is crucial in the proof that bisimilarity 
is a congruence for TSS specified in De Simone format. 

The positive GSOS format [3] extends De Simone rules in several ways: (1)
multiple testings of the same argument (the $x_i$) are allowed in the premises; (2)
tested arguments can appear in the target t of the conclusion; (3) the target t
of the conclusion can be a non-linear context (variables can be used more than
once). The tyft format [13] further generalizes the GSOS, by allowing generic
terms $t_i$ as sources of the transitions in the premises. Notice however that the
main restriction about the source of the conclusion still persists: allowing
conclusions of the form $C[x_1, \dots, x_n] \xrightarrow{a} t$ could compromise the ‘bisimilarity as
congruence’ property, as the following example illustrates.

Example 1. Consider a process algebra over the signature $\Sigma = \{nil, a.\_, \bar{a}.\_, \_|\_\}$
with a ranging over a suitable set of channels A, where nil is the empty agent, $a.\_$
and $\bar{a}.\_$ are two complementary unary operators for action prefix (on the channel
a) and $\_|\_$ is a binary parallel composition operator. If we consider the tss
consisting of the axiom $\lambda.x_1 \,|\, \bar{\lambda}.x_2 \xrightarrow{\tau} x_1 \,|\, x_2$ plus the usual rules that propagate
the $\tau$ through the $\_|\_$ operator (asynchronously), then it is obvious that $a.nil \simeq
\bar{a}.nil$. However, if put in the context $x_1 \,|\, \bar{a}.nil$, then $a.nil \,|\, \bar{a}.nil \xrightarrow{\tau} nil \,|\, nil$,
while $\bar{a}.nil \,|\, \bar{a}.nil$ cannot move. Hence $a.nil \,|\, \bar{a}.nil \not\simeq \bar{a}.nil \,|\, \bar{a}.nil$.
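The phenomenon of Example 1 can be checked mechanically on a toy encoding. The sketch below assumes that the axiom synchronises a prefix on the left of the parallel operator with its complementary prefix on the right, and that the resulting tau move is propagated through either side of a parallel composition; the constructor names `pre`, `cop` and `par` are hypothetical.

```python
NIL = ("nil",)


def pre(channel, cont):
    """Action prefix a.t on the given channel."""
    return ("pre", channel, cont)


def cop(channel, cont):
    """Complementary action prefix on the given channel."""
    return ("cop", channel, cont)


def par(left, right):
    """Parallel composition left | right."""
    return ("par", left, right)


def steps(term):
    """One-step transitions of a closed term, as a set of (label, target) pairs."""
    result = set()
    if term[0] == "par":
        _, left, right = term
        # Axiom: a prefix on the left reacts with its complement on the right.
        if left[0] == "pre" and right[0] == "cop" and left[1] == right[1]:
            result.add(("tau", par(left[2], right[2])))
        # Asynchronous propagation of moves through the parallel operator.
        for label, left2 in steps(left):
            result.add((label, par(left2, right)))
        for label, right2 in steps(right):
            result.add((label, par(left, right2)))
    return result


A_NIL = pre("a", NIL)
CO_A_NIL = cop("a", NIL)

# In isolation neither prefix can move, so the two terms are trivially bisimilar.
ALONE = (steps(A_NIL), steps(CO_A_NIL))

# Placed in the same context x1 | (complementary prefix), only the first one reacts.
IN_CONTEXT_1 = steps(par(A_NIL, CO_A_NIL))
IN_CONTEXT_2 = steps(par(CO_A_NIL, CO_A_NIL))
```

Since the two terms have the same (empty) behaviour alone but different behaviour inside the same context, bisimilarity on this system is not a congruence.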

Dynamic bisimulation has been defined to reproduce the effect of run-time re- 
configuration in open ended systems. It extends the ordinary bisimulation-game 
by allowing moves that put the states under comparison in the same context. 






Definition 4. Given a lts $L = (T_\Sigma, A, \to)$, a dynamic bisimulation on L is a
symmetric, reflexive relation $\sim_d \;\subseteq T_\Sigma \times T_\Sigma$ such that if $s \sim_d t$ then for any
unary context $C[\_]$ (including the identity) and transition $C[s] \xrightarrow{a} s'$ there exists
a transition $C[t] \xrightarrow{a} t'$ with $s' \sim_d t'$.

Two states s and t are dynamic bisimilar, written $s \simeq_d t$, if there exists a
dynamic bisimulation $\sim_d$ such that $s \sim_d t$. Thus, w.r.t. Example 1 we have, e.g.,
$a.nil \not\simeq_d \bar{a}.nil$. Dynamic bisimilarity is the coarsest congruence which is also a
bisimulation. Note that context moves cannot be ‘observed’: they are part of the
game but not of the lts. Even if not remarked in m. dynamic bisimulation can
however be recast as ordinary bisimulation over an extended system. Observe
that the dynamic extension is an infinitary construction on the lts and is not
expressed at the level of the tss, i.e., it remains at the ‘meta-level’.

Definition 5. Given a lts $L = (T_\Sigma, A, \to)$, its dynamic extension $\hat{L}$ is the lts
$(T_\Sigma, \hat{A}, \to \cup \leadsto)$, where $s \stackrel{C[\_]}{\leadsto} C[s]$ for all s and unary contexts $C[\_]$.

Proposition 1. $s \simeq_d t$ in L iff $s \simeq t$ in $\hat{L}$.

The proof of Proposition 1 relies on the fact that if s makes a move $C[\_]$,
then t can always simulate such move in a unique way.

2 Tile Formats 

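A tile system is a tuple $\mathcal{R} = (\mathcal{H}, \mathcal{V}, N, R)$ where $\mathcal{H}$ and $\mathcal{V}$ are monoidal categories
with the same set of objects $O_{\mathcal{H}} = O_{\mathcal{V}}$, N is the set of rule names and
$R: N \to \mathcal{H} \times \mathcal{V} \times \mathcal{V} \times \mathcal{H}$ is a function such that for all $\alpha \in N$, if $R(\alpha) = (s, a, b, t)$, then
we have $s: x \to y$, $a: x \to z$, $b: y \to w$, and $t: z \to w$ for suitable objects x, y, z
and w. We shall write such a rule either as the sequent $\alpha: s \xrightarrow[b]{a} t$, or as the tile

$$\begin{array}{ccc} x & \xrightarrow{\;s\;} & y \\ {\scriptstyle a}\downarrow & \alpha & \downarrow{\scriptstyle b} \\ z & \xrightarrow{\;t\;} & w \end{array}$$

thus making explicit the source and target of each arrow. The category $\mathcal{H}$ is
called horizontal and its arrows configurations. The category $\mathcal{V}$ is called vertical
and its arrows observations. The objects of $\mathcal{H}$ and $\mathcal{V}$ are called interfaces.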

Starting from the basic tiles $R(\alpha)$ of the system, more complex tiles can
be constructed via horizontal, vertical and parallel composition. Moreover, the
horizontal and vertical identities are always added to the system and composed
with the basic tiles. All this is illustrated in Figure El Depending on the chosen
tile format, $\mathcal{H}$ and $\mathcal{V}$ must satisfy certain constraints and suitable auxiliary tiles
are added and composed with basic tiles and identities in all the possible ways.
The set of resulting tiles (called flat sequents) define the flat tile logic associated
to $\mathcal{R}$. We say that $s \xrightarrow[b]{a} t$ is entailed by the logic, written $\mathcal{R} \vdash s \xrightarrow[b]{a} t$, if the
sequent $s \xrightarrow[b]{a} t$ can be expressed as the composition of basic and auxiliary tiles.
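Horizontal and vertical composition of tiles can be illustrated with a deliberately simplified sketch. It makes strong assumptions that are not in the paper: arrows are plain paths of generator names (the empty tuple standing for an identity), interfaces and parallel composition are ignored, and the names `Tile`, `hcomp` and `vcomp` are hypothetical.

```python
from typing import NamedTuple, Tuple

Arrow = Tuple[str, ...]  # a path of generator names; () plays the role of an identity


class Tile(NamedTuple):
    s: Arrow  # initial configuration (top side)
    a: Arrow  # trigger (left side)
    b: Arrow  # effect (right side)
    t: Arrow  # final configuration (bottom side)


def hcomp(first: Tile, second: Tile) -> Tile:
    """Horizontal composition: the effect of the first tile triggers the second."""
    if first.b != second.a:
        raise ValueError("effect of the first tile must equal the trigger of the second")
    return Tile(first.s + second.s, first.a, second.b, first.t + second.t)


def vcomp(first: Tile, second: Tile) -> Tile:
    """Vertical composition: the final configuration of the first tile starts the second."""
    if first.t != second.s:
        raise ValueError("final configuration of the first tile must start the second")
    return Tile(first.s, first.a + second.a, first.b + second.b, second.t)


# Hypothetical basic tiles.
LEFT = Tile(("s1",), ("a",), ("c",), ("t1",))
RIGHT = Tile(("s2",), ("c",), ("b",), ("t2",))
BELOW = Tile(("t1", "t2"), ("a2",), ("b2",), ("u",))

WIDE = hcomp(LEFT, RIGHT)   # s1;s2 with trigger a and effect b, yielding t1;t2
TALL = vcomp(WIDE, BELOW)   # the two observations are sequenced on each side
```

Read backwards, `hcomp` also shows the shape of a horizontal decomposition: a tile with source `s1;s2` is split into two tiles that agree on an intermediate observation.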
Definition 6. Let $\mathcal{R} = (\mathcal{H}, \mathcal{V}, N, R)$ be a tile system. A symmetric relation $\sim_t$
on configurations is called tile bisimulation if whenever $s \sim_t t$ and $\mathcal{R} \vdash s \xrightarrow[b]{a} s'$,
then there exists t' such that $\mathcal{R} \vdash t \xrightarrow[b]{a} t'$ and $s' \sim_t t'$.

The maximal tile bisimulation is denoted by $\simeq_t$, and two configurations s
and t are said to be tile bisimilar if $s \simeq_t t$.

We are particularly interested in considering tile systems where the monoidal
categories of configurations and observations are freely generated from suitable
horizontal and vertical signatures, respectively, i.e., they are categories of
substitutions, as discussed in the previous section.

The tile format proposed in the original presentation of tiles [12] is the
so-called algebraic tile format, which recalls the perspective of ordinary TSS:
configurations are terms over a certain signature, and observations are the arrows
of the monoidal category freely generated by certain labels (regarded as unary
operators). Auxiliary tiles lift the horizontal cartesian structure to the horizontal
composition of tiles. In the algebraic tile format basic tiles have the form:

$$\begin{array}{ccc} \underline{n} & \xrightarrow{\;s\;} & \underline{1} \\ {\scriptstyle a_1 \otimes \cdots \otimes a_n}\downarrow & & \downarrow{\scriptstyle a} \\ \underline{n} & \xrightarrow{\;t\;} & \underline{1} \end{array}$$

where the $a_i$ and a can be either labels (viewed as arrows from $\underline{1}$ to $\underline{1}$) or
identities and $s, t \in T_\Sigma(\{x_1, \dots, x_n\})$. The idea is that each interface represents
an ordered sequence of variables; therefore each variable is completely identified
by its position in the tuple, and a standard naming $x_1, \dots, x_n$ of the variables can
be assumed for all interfaces. A typical auxiliary tile for the algebraic format is

$$\begin{array}{ccc} \underline{1} & \xrightarrow{\;\langle x_1, x_1\rangle\;} & \underline{2} \\ {\scriptstyle a}\downarrow & & \downarrow{\scriptstyle a \otimes a} \\ \underline{1} & \xrightarrow{\;\langle x_1, x_1\rangle\;} & \underline{2} \end{array}$$

that duplicates the observation a (trigger of the tile) propagating it to two
instances of the unique variable in the initial interface. We refer to [12] for more
details. The algebraic tile format corresponds to SOS rules of the form






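$$\frac{\{x_i \xrightarrow{a_i} y_i \mid i \in I\}}{C[x_1, \dots, x_n] \xrightarrow{a} D[y_1, \dots, y_n]}$$

where $I \subseteq \{1, \dots, n\}$, $C[x_1, \dots, x_n]$ and $D[y_1, \dots, y_n]$ are contexts (corresponding
to s and t in the tile), and all the $y_i$ and $x_i$ are different if $i \in I$, but $y_i = x_i$
otherwise. The correspondence follows since for all closed terms s and t and for
any label a, $\mathcal{R} \vdash s \xrightarrow[a]{id_0} t$ if and only if the lts associated to the SOS specification
includes the transition $s \xrightarrow{a} t$.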

The algebraic tile format is not uniform in the two dimensions, since $\mathcal{H}$ is
cartesian, whereas $\mathcal{V}$ is only monoidal (non symmetric). Since our idea is to
observe contexts by replicating (part of) the horizontal structure in the vertical
dimension, we prefer (1) to renounce the cartesian structure altogether,
resorting to the simpler monoidal tile format HZ] where only linear contexts are
allowed, or (2) to consider the more general term tile format Hi) where also $\mathcal{V}$ is
cartesian. Notice that monoidal theories suffice for expressing all closed terms,
even though, as explained in HI, the term tile format is more expressive.

Since in all tile formats the categories of configurations and observations are
freely generated by the horizontal signature $\Sigma$ and by (the signature associated
to) the set of labels A, monoidal/algebraic/term tile systems are usually
represented as tuples of the form $\mathcal{R} = (\Sigma, A, N, R)$.

According to the term tile format each basic tile has the form:

$$\begin{array}{ccc} \underline{n} & \xrightarrow{\;h\;} & \underline{m} \\ {\scriptstyle v}\downarrow & & \downarrow{\scriptstyle u} \\ \underline{k} & \xrightarrow{\;g\;} & \underline{1} \end{array}$$

with $h \in T_{\Sigma_H}(X_n)^m$, $g \in T_{\Sigma_H}(X_k)$, $v \in T_{\Sigma_V}(X_n)^k$ and $u \in T_{\Sigma_V}(X_m)$, where
$X_i = \{x_1, \dots, x_i\}$ is a chosen set of variables, $\Sigma_H$ is the horizontal signature of
configurations and $\Sigma_V$ is the vertical signature of observations. Of course, if $\Sigma_V$
contains only elementary actions, regarded as unary operators, then m = 1. We
present tiles more concisely as logic sequents $n \triangleleft h \xrightarrow[u]{v} g$, where the number of
variables in the ‘upper-left’ corner of the tile is made explicit (the values m and k
can be recovered from the lengths of h and v). Again, a standard naming for the
variables in the interfaces is assumed. For example, if the variable $x_i$ appears in
the effect u of the above rule, then the effect u depends on the ith component $h_i$ of
the initial configuration. Analogously for the remaining connections. As already
remarked, the same variable $x_i$ denotes the ith element of different interfaces
when used in each of the four border-arrows of the tile (as a matter of fact, only
the occurrences of $x_i$ in h and in v denote the same element of the initial input
interface n).

Auxiliary tiles for term tile systems consist of all term tiles $n \triangleleft h \xrightarrow[u]{v} g$
such that h, g, u and v are terms over the empty signature (and therefore also
terms of $T_{\Sigma_V}(X)$ and $T_{\Sigma_H}(X)$) such that h;u = v;g, i.e., all tiles that perform
consistent rearrangements of a generic interface in the two dimensions. A typical
auxiliary term tile is $1 \triangleleft x_1 \xrightarrow[x_1, x_1]{x_1, x_1} x_1, x_2$, which consistently duplicates the unary
interface. Observe that the term tile format extends the positive GSOS format.
An interesting question concerns suitable restrictions of the monoidal and
term tile formats such that tile bisimilarity yields a congruence. Two main prop- 
erties have been investigated in the literature for obtaining tile bisimilarity con- 
gruences: the basic source and the tile decomposition. Tile decomposition has a 
completely abstract formulation that applies to all tile systems. 

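Definition 7. A tile system $\mathcal{R} = (\mathcal{H}, \mathcal{V}, N, R)$ enjoys the decomposition
property if for all arrows $s: x \to y \in \mathcal{H}$ and for all sequents $s \xrightarrow[b]{a} t$ entailed by $\mathcal{R}$

- if $s = s_1; s_2$ then there exist $c \in \mathcal{V}$ and $t_1, t_2 \in \mathcal{H}$ such that $\mathcal{R} \vdash s_1 \xrightarrow[c]{a} t_1$,
  $\mathcal{R} \vdash s_2 \xrightarrow[b]{c} t_2$ and $t = t_1; t_2$;

- if $s = s_1 \otimes s_2$ then there exist $a_1, a_2, b_1, b_2 \in \mathcal{V}$ and $t_1, t_2 \in \mathcal{H}$ such that
  $\mathcal{R} \vdash s_1 \xrightarrow[b_1]{a_1} t_1$, $\mathcal{R} \vdash s_2 \xrightarrow[b_2]{a_2} t_2$, $a = a_1 \otimes a_2$, $b = b_1 \otimes b_2$ and $t = t_1 \otimes t_2$.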



Proposition 2. If $\mathcal{R}$ enjoys the decomposition property, then tile bisimilarity
is a congruence (w.r.t. the operations of the horizontal category, i.e., sequential
and parallel composition).

Even though we did not address it explicitly, all the definitions we have 
given for tiles apply also to the case of term algebras modulo structural axioms 
(e.g., associativity and commutativity of parallel composition in CCS) and all 
our results can be immediately extended. 



3 Basic Source 

The results in this section mildly extend those of m- Since in this paper we 
consider equivalences on closed terms, we refine the notion of tile bisimulation 
to ground tile bisimulation. 

Definition 8. Let $\mathcal{R} = (\Sigma_H, \Sigma_V, N, R)$ be a monoidal (resp. term) tile system.
A symmetric relation $\sim_g$ on closed configurations (i.e., elements of $T_{\Sigma_H}$) is
called ground tile bisimulation if whenever $s \sim_g t$ and $\mathcal{R} \vdash s \xrightarrow[a]{id_0} s'$, then there
exists t' such that $\mathcal{R} \vdash t \xrightarrow[a]{id_0} t'$ and $s' \sim_g t'$.

Ground tile bisimulation is the exact counterpart of ordinary bisimulation
for LTS. It differs from tile bisimulation in that it is not defined on contexts. (Since
ground terms need no trigger, ground bisimulation tests only the effects they
can produce.) The maximal ground tile bisimulation is denoted by $\simeq_g$, and two
closed configurations s and t are said to be ground tile bisimilar if $s \simeq_g t$. In
fact we are interested in the LTS associated to tile systems.

Definition 9. For $\mathcal{R} = (\Sigma_H, \Sigma_V, N, R)$ a monoidal/term tile system, the LTS
associated to $\mathcal{R}$ is $L_{\mathcal{R}} = (T_{\Sigma_H}, T_{\Sigma_V}(\{x_1\}), \to)$ where $s \xrightarrow{a} t$ iff $\mathcal{R} \vdash s \xrightarrow[a]{id_0} t$.
The decomposition property can be refined and related to the ‘tile bisimilarity
as congruence’ property.

Definition 10. A term (resp. monoidal) tile system $\mathcal{R}$ enjoys the ground
decomposition property if for any $s \in T_{\Sigma_H}$ and any sequent $\mathcal{R} \vdash C[s_1] \xrightarrow[a]{id_0} t$ with
C a unary (resp. linear unary) context and $s_1$ a ground term such that $s = C[s_1]$,
there exist an observation c, a ground term $t_1$ and a (resp. linear) context
D such that $\mathcal{R} \vdash s_1 \xrightarrow[c]{id_0} t_1$ and $\mathcal{R} \vdash C[x_1] \xrightarrow[a]{c} D[x_1]$, with $t = D[t_1]$.

Theorem 1. Let $\mathcal{R} = (\Sigma_H, \Sigma_V, N, R)$ be a term tile system. The ground
decomposition property implies that ground tile bisimilarity on $\mathcal{R}$ is a congruence.

Proof. Standard. Define the congruence $\cong_g$ as the minimal relation such that if
$s \simeq_g t$ and $C[x_1]$ is a unary context then $C[s] \cong_g C[t]$. Obviously $\simeq_g \subseteq \cong_g$. We
then show that $\cong_g$ is a ground tile bisimulation and, therefore, coincides with
ground tile bisimilarity. In fact, let $C[s] \cong_g C[t]$ for $s, t \in T_{\Sigma_H}$ with $s \simeq_g t$ and
$C[x_1]$ a unary context. By ground decomposition we have that if $\mathcal{R} \vdash C[s] \xrightarrow[a]{id_0} s'$,
there exist $s_1 \in T_{\Sigma_H}$, an observation b and a unary context $D[x_1]$ such that
$\mathcal{R} \vdash s \xrightarrow[b]{id_0} s_1$, $\mathcal{R} \vdash C[x_1] \xrightarrow[a]{b} D[x_1]$ and $s' = D[s_1]$. Since $s \simeq_g t$ then $\mathcal{R} \vdash t \xrightarrow[b]{id_0} t_1$
for some $t_1 \in T_{\Sigma_H}$ with $s_1 \simeq_g t_1$. By horizontal composition of tiles we then
have $\mathcal{R} \vdash C[t] \xrightarrow[a]{id_0} D[t_1]$. By definition of $\cong_g$ we have that $D[s_1] \cong_g D[t_1]$. □



Theorem 2. Given a monoidal tile system $\mathcal{R} = (\Sigma, A, N, R)$, the ground
decomposition property implies that ground tile bisimilarity is a congruence.

Proof. Similar to the proof of Theorem 1, but requires $\cong_g$ to be the minimal
relation such that if $s \simeq_g t$ and $C[x_1]$ is a linear (rather than generic) unary
context then $C[s] \cong_g C[t]$. Observe that the congruence property then holds for
generic contexts. In fact, if $D[\_]$ is not linear, let $D'[x_1, x_2, \dots, x_n]$ be the linear
context obtained from $D[\_]$ by replacing each occurrence of the hole by a
different variable $x_i$. Then given any two ground tile bisimilar closed terms s and t we have

$$D[s] = D'[s, s, \dots, s] \simeq_g D'[t, s, \dots, s] \simeq_g \cdots \simeq_g D'[t, t, \dots, t] = D[t]$$

since all contexts $D'[\_, s, \dots, s]$, $D'[t, \_, s, \dots, s]$, ..., $D'[t, \dots, t, \_]$ are linear. □

The ground decomposition property can be enforced by syntactical
constraints on basic tiles. A tile system satisfies the basic source property if the
initial configuration of each basic tile consists of a single operator, rather than a
generic context.

Proposition 3. If a monoidal (resp. term) tile system $\mathcal{R}$ enjoys the basic source
property, then the ground tile bisimilarity on $\mathcal{R}$ is a congruence.

The proof of Proposition 3 consists of two steps: we first prove that basic
source implies ground tile decomposition and then we conclude by exploiting
Theorems 1 and 2. Although Theorems 1 and 2 hold also when structural axioms
are imposed on configurations, this is not necessarily the case for Proposition 3,
as the first part of the argument above may fail. We remark that for the monoidal
tile format the proof coincides with that for the De Simone format, since the
two formats are essentially the same.

4 Dynamic Tile Bisimulation 

When the basic source property is not satisfied we are likely to have
bisimilarities that are not congruences. This consideration applies to all the most
widespread formats, unless the specification contains enough rules to distinguish
context-dependent redexes. For example, Corradini and Heckel in joint work with the
second author jH] suggested that one may deal with this situation via a closure
operation on the TSS rules. Likewise Sewell in m, when passing from reduction systems to
transition systems, performs such a closure by adding a transition labeled with
C for each state s and context C which s can react with in order to perform a
reduction. However, such closures are expressed at a meta-level, as they are not
handled by adding rules to the original specification. Therefore, from a
different perspective, dynamic bisimilarity is more satisfactory, since it allows for a
concise definition of the congruence one is looking for.

Tiles have the expressive power to reconcile finitary system specifications
and dynamic bisimulation within the same perspective, i.e., by adding suitable
auxiliary tiles. Moreover, the closure w.r.t. all contexts can be expressed simply
by adding just as many basic tiles as the operators in the signature. Hence, if
the signature is finite, so is the set of additional auxiliary tiles needed.

Definition 11. Given a term (resp. monoidal) tile system $\mathcal{R} = (\Sigma_H, \Sigma_V, N, R)$,
its dynamic extension $\hat{\mathcal{R}}$ is obtained by adding, for all n and for any operator
f of arity n in $\Sigma_H$, the auxiliary operator $\hat{f}$ of arity n to $\Sigma_V$ and the following auxiliary tiles.

$$n \triangleleft x_1, \dots, x_n \xrightarrow[\hat{f}(x_1, \dots, x_n)]{x_1, \dots, x_n} f(x_1, \dots, x_n) \qquad\qquad n \triangleleft f(x_1, \dots, x_n) \xrightarrow[x_1]{\hat{f}(x_1, \dots, x_n)} x_1$$

For t a generic horizontal context, we let $\hat{t}$ denote the corresponding vertical
context on the extended signature, which is obtained by replacing each operator
f that appears in t by its vertical counterpart $\hat{f}$, leaving variables unchanged.
For the proof of the main theorem we need the following technical lemmas that
require some acquaintance with the term tile format. A corresponding lemma
can be proved for the monoidal tile format, by considering linear contexts only.
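The passage from a horizontal context to its vertical counterpart is a purely syntactic relabelling, which the following sketch makes explicit. The encoding is an assumption for illustration: variables are strings, operator applications are tuples `(op, args...)`, and the vertical counterpart of an operator named `f` is represented by the hypothetical name `f^`.

```python
def hat(term):
    """Replace every operator of a horizontal context by its vertical counterpart."""
    if isinstance(term, str):            # variables are left unchanged
        return term
    op, *args = term
    return (op + "^", *(hat(arg) for arg in args))


# Hypothetical context t = f(x1, g(x2)) and its vertical counterpart.
CONTEXT = ("f", "x1", ("g", "x2"))
VERTICAL = hat(CONTEXT)
```

Because only operator names change, the translation preserves the variables and the tree shape of the context, which is what the inductive arguments on the depth of t rely on.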

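Lemma 1. Given a term tile system $\mathcal{R} = (\Sigma_H, \Sigma_V, N, R)$, for each context
$t: n \to 1$ we have $\hat{\mathcal{R}} \vdash n \triangleleft x_1, \dots, x_n \xrightarrow[\hat{t}]{x_1, \dots, x_n} t$.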
Proof. The proof proceeds by induction on the (maximum) depth m of the
tree-like representation of t. The base case m = 0 is trivial. If $m \geq 1$ then
$t = f(t_1, \dots, t_k)$, where f is a k-ary operator of $\Sigma_H$ and the $t_i: n \to 1$ are contexts
with (maximum) depth strictly less than m. Then we can apply the inductive
hypothesis to conclude that $\hat{\mathcal{R}} \vdash n \triangleleft x_1, \dots, x_n \xrightarrow[\hat{t}_i]{x_1, \dots, x_n} t_i$ for $i \in [1, k]$. Composing
such sequents in parallel we get

$$k \cdot n \triangleleft x_1, \dots, x_{k \cdot n} \xrightarrow[\hat{s}_1, \dots, \hat{s}_k]{x_1, \dots, x_{k \cdot n}} s_1, \dots, s_k,$$

where $s_i = t_i[x_{n \cdot (i-1)+1}/x_1, \dots, x_{n \cdot i}/x_n]$, i.e., the $s_i$ are the $t_i$ with the variables
suitably renamed according to the initial input interface. Then we can
horizontally compose with the auxiliary tile of term tile systems that makes k copies of
the n inputs, obtaining the sequent

$$\alpha = n \triangleleft x_1, \dots, x_n \xrightarrow[\hat{t}_1, \dots, \hat{t}_k]{x_1, \dots, x_n} t_1, \dots, t_k.$$

Finally we can compose it with the auxiliary sequent of the extended system
$\beta = k \triangleleft x_1, \dots, x_k \xrightarrow[\hat{f}(x_1, \dots, x_k)]{x_1, \dots, x_k} f(x_1, \dots, x_k)$ as in

$$\begin{array}{c|c} \alpha & \gamma \\ \hline \delta & \beta \end{array}$$

where $\gamma$ is the horizontal identity for the effect of $\alpha$ and $\delta$ is the vertical
identity for the final configuration of $\alpha$. The composition yields the sequent

$$n \triangleleft x_1, \dots, x_n \xrightarrow[\hat{f}(\hat{t}_1, \dots, \hat{t}_k)]{x_1, \dots, x_n} f(t_1, \dots, t_k) \;=\; n \triangleleft x_1, \dots, x_n \xrightarrow[\hat{t}]{x_1, \dots, x_n} t$$

and concludes the proof. □

A similar argument shows the following lemma. 

Lemma 2. Given a term tile system $\mathcal{R} = (\Sigma_H, \Sigma_V, N, R)$, for each context
$t: n \to 1$ we have $\hat{\mathcal{R}} \vdash n \triangleleft t \xrightarrow[x_1]{\hat{t}} x_1$.



Theorem 3. Let $\mathcal{R} = (\Sigma_H, \Sigma_V, N, R)$ be a term tile system. The ground tile
bisimilarity defined on $\hat{\mathcal{R}}$ defines a congruence for $\mathcal{R}$.

Proof. First notice that the auxiliary tiles do not influence the definition of
ground tile bisimilarity, which deals only with null triggers. Then, we prove
that $\hat{\mathcal{R}}$ enjoys the ground decomposition property, from which we get the
expected ‘bisimilarity as a congruence’ property. In fact, given a generic sequent
$\alpha: C[s] \xrightarrow[a]{id_0} t$ entailed by $\hat{\mathcal{R}}$, we can always construct two tiles with source s and
$C[x_1]$ respectively that decompose $\alpha$, as illustrated in Figure 3. □
Figure 3.



Theorem 4. Ground tile bisimilarity on $\hat{\mathcal{R}}$, denoted by $\simeq_{\hat{g}}$, coincides with the
dynamic bisimilarity on $L_{\mathcal{R}}$.

Proof. We must show that $L_{\hat{\mathcal{R}}} = \hat{L}_{\mathcal{R}}$. The inclusion $L_{\hat{\mathcal{R}}} \supseteq \hat{L}_{\mathcal{R}}$ follows directly
from the technical lemmas above, whilst the inclusion $L_{\hat{\mathcal{R}}} \subseteq \hat{L}_{\mathcal{R}}$ is more involved.
The key point is showing that if an auxiliary ‘context’ tile gives rise to a new
transition, its label $\hat{f}$ appears manifestly, i.e., the use of auxiliary tiles cannot
be ‘hidden’ inside the proof to originate unexpected reactions. To show this,
let $\hat{\mathcal{R}} \vdash s \xrightarrow[a]{id_0} t$ and suppose that the proof of $s \xrightarrow[a]{id_0} t$ contains the auxiliary
tile $\alpha = x \xrightarrow[\hat{f}(x)]{x} f(x)$, for some operator f. We proceed by induction on the
number k of such auxiliary tiles and then by case analysis. If k = 0 then
$s \xrightarrow{a} t \in L_{\mathcal{R}} \subseteq \hat{L}_{\mathcal{R}}$. If $k \geq 1$ then we take one such auxiliary tile $\alpha$ in the proof
and examine the following three cases: (1) the effect of $\alpha$ is propagated to the
final effect $a = A[\hat{f}(a')]$ and thus can be observed; (2) the tile $\alpha$ is horizontally
composed with $\beta = f(x) \xrightarrow[x_1]{\hat{f}(x)} x_1$ and thus does not appear in a; (3) the effect $\hat{f}$ is
vertically composed with other effects that override it. In case (1), $\alpha$ corresponds
to a context move in $\hat{L}_{\mathcal{R}}$. In case (2), the composition of $\alpha$ with $\beta$ yields the
vertical identity on f, which is of course entailed in $\mathcal{R}$. Finally, in case (3), the
effect $\hat{f}$ can only be overridden because of structural axioms on observations,
and in particular those involving projections. But then it can be shown that
if projections are used that throw $\hat{f}$ away, then also the result of applying the
context f in the intermediate state is thrown away in the proof. Therefore, we
can always reduce to a proof with k - 1 auxiliary tiles for adding contexts and
conclude the proof by inductive hypothesis. □



For monoidal tile systems the proof simplifies considerably, since case (3)
cannot occur. We remark that if observations are subject to structural axioms
(e.g., b;a = a for all observations b), then such axioms must not be extended to
the $\hat{f}$, otherwise case (3) of the proof could be compromised.



Example 2. Let us take again the process algebra of Example 1 with _ | _
associative (but neither commutative nor with unit). Then a.nil \ a.nil ~g (3. nil \
j3.nil a.nil \ a.nil. While, e.g., t^ — g t /3 — g ta for t\ = nil \ X.nil \ X.nil \ nil. 
Concluding Remarks

We have proposed tile logic as a compositional framework suitable to deal with 
open ended systems, dynamic bisimulation and structural axioms on states. Such
characteristics follow naturally from the abstract ‘geometrical’ concepts on which
tile configurations and observations are based. In particular, the winning
feature is the possibility of exploiting the analogy between horizontal and vertical
arrows, making observations out of contexts. Moreover, dynamic bisimulation is
handled via a finitary enrichment of the specification and the congruence proof 
has a simple pictorial representation that exploits ground decomposition. 

Following the lines suggested in this paper, one could limit run-time recon- 
figuration either to a sub-class of contexts or to a sub-class of configurations. 
Although we have not discussed the issue here, tile logic can deal with trace 
semantics as well. 
Fibred Models of Processes:
Discrete, Continuous, and Hybrid Systems 
(Extended Abstract) 



Marcelo P. Fiore 

COGS, University of Sussex, Falmer, Brighton BN1 9QH, UK



Abstract. We present the rudiments of a unifying theory of general
processes encompassing discrete, continuous, and hybrid systems. The
main focus is on the study of process behaviour, but constructions on
processes are also considered in some detail. In particular, we show that
processes admit an abstract, conceptual treatment of bisimilarity via the
notion of open map (as advocated by Winskel et al.). Furthermore, we
present a tool-kit of categorical constructions on processes that can be
regarded as the basis of a process description language. Within the
general theory, typical operations of process calculi on discrete and hybrid
systems are discussed.



Introduction 

We advocate a fibred view of general processes; in which, roughly, a process

$$\begin{array}{c} X \\ \downarrow \\ C \end{array}$$

consists of a state space X (of states and their evolutions) varying according to
a control C (of observations, time, etc.). By choosing the control C to consist
of discrete observations or of continuous time we respectively obtain models of
discrete and continuous processes (Section 1); hybrid systems arise from mixed
discrete-continuous controls (Section^. Further, we consider a simulation 



X X' 

c c 

between processes to be a map X — X' between the corresponding state spaces 
that respects the variation. 

We will see that the category U of processes and simulations comes equipped
with a canonical choice of basic processes (or paths) V from which arbitrary
processes can be constructed by appropriately glueing basic ones. This situation
yields a model for concurrency

$V \to U$

in the sense of Winskel et al. [JNW96, CW97], and is important because one can study
process behaviour via the notion of open map (Section 3). Moreover, we show
that the category of processes U supports categorical constructions modelling 
typical operations of process calculi (Section 0 . 

We hope that this work will contribute towards the understanding of (and 
reasoning about) general processes and thereafter towards the development of a 
theory to aid the design of languages (and logics) for them. 

1 Fibred Models of Discrete and Continuous Systems 

We consider discrete and continuous systems from a fibred viewpoint. These
systems are examples of linearly-controlled processes (or interleaving models, in
the jargon of concurrency theory) in the sense of |BF^ .



1.1 Discrete Systems 

We study the familiar model of concurrent computation given by transition 
graphs (or automata) as an instance of the general theory to be developed in 
Section 0 

Transition graphs. For a set A, the category $TG_A$ of A-labelled transition
graphs (or automata) has objects given by graphs $dom, cod : E \to N$ equipped
with a labelling function $lab : E \to A$, and morphisms given by pairs of functions
between nodes and edges that preserve the domain, codomain and labelling
of edges. Transition systems are extensional transition graphs, for which the
function $E \to N \times A \times N : e \mapsto (dom(e), lab(e), cod(e))$ is injective.
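Extensionality is a simple injectivity test, which can be sketched for finite transition graphs as follows. The representation is an assumption made for illustration: edges are arbitrary hashable identifiers and `dom`, `cod`, `lab` are dictionaries indexed by edges; the function name `is_extensional` is hypothetical.

```python
def is_extensional(edges, dom, cod, lab):
    """Check that e |-> (dom(e), lab(e), cod(e)) is injective on the given edges."""
    seen = set()
    for e in edges:
        key = (dom[e], lab[e], cod[e])
        if key in seen:      # two distinct edges share source, label and target
            return False
        seen.add(key)
    return True


# Hypothetical graph with two parallel 'a'-labelled edges from node 0 to node 1.
EDGES = ["e1", "e2", "e3"]
DOM = {"e1": 0, "e2": 0, "e3": 1}
COD = {"e1": 1, "e2": 1, "e3": 0}
LAB = {"e1": "a", "e2": "a", "e3": "b"}

WHOLE_GRAPH = is_extensional(EDGES, DOM, COD, LAB)           # e1 and e2 clash
WITHOUT_E2 = is_extensional(["e1", "e3"], DOM, COD, LAB)     # a transition system
```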

Transition categories. Let $M(A) = (A^*, \varepsilon, \cdot)$ be the free monoid on a set
A. An A-labelled transition category is a functor $\ell : \mathbb{C} \to M(A)$ such that, for every
$e \in \mathbb{C}$ and $a_0, a_1 \in M(A)$, if $\ell(e) = a_0 \cdot a_1$ in M(A) then there exists a
unique factorisation $e = e_0 \cdot e_1$ in $\mathbb{C}$ for which $\ell(e_0) = a_0$ and $\ell(e_1) = a_1$. The
category $TC_A$ has A-labelled transition categories as objects and morphisms
$(\ell : \mathbb{C} \to M(A)) \to (\ell' : \mathbb{C}' \to M(A))$ given by functors $h : \mathbb{C} \to \mathbb{C}'$ that preserve the
labelling; that is, such that $\ell = \ell' h$.

Note that there is a bijective correspondence between transition graphs and
transition categories as follows. A transition graph G, given by $dom, cod : E \to N$
and $lab : E \to A$, corresponds to the transition category $\ell : \mathrm{F}G \to M(A)$, where $\mathrm{F}G$ is the
free category on the graph G (with set of objects given by the nodes of G and
morphisms given by paths in G) and where $\ell(e_0 \cdots e_n) = lab(e_0) \cdots lab(e_n)$. Conversely,
a transition category $\ell : \mathbb{C} \to M(A)$ corresponds to the transition graph with nodes $|\mathbb{C}|$,
edges E, maps $dom, cod : E \to |\mathbb{C}|$ and labelling $\ell : E \to A$,
where $E = \{ e \in \mathbb{C} \mid \ell(e) \in A \}$.

Proposition 1.1. For any set A, the categories $TG_A$ and $TC_A$ are equivalent.

Thus, transition graphs and transition categories provide the same model of
parallel computation. However, the fibred representation provided by transition
categories embodies the dynamics of the systems. Indeed, as we show below,
$TC_A$ comes equipped with canonical categories of computation paths which
induce behavioural notions of equivalence.

Discrete path categories. The category $M(A)_{\S}$ has $A^*$ as set of objects and
morphisms $\alpha \to \beta$ given by pairs $(\varphi, \gamma)$ in $A^*$ such that $\varphi \cdot \alpha \cdot \gamma = \beta$. One thinks
of a morphism $\alpha \to \beta$ as a pre- and post-extension of $\alpha$ yielding $\beta$. The
subcategory of $M(A)_{\S}$ consisting of post-extensions, viz. morphisms
of the form $(\varepsilon, \gamma)$, where $\varepsilon$ denotes the empty string,
is the poset $P(A) = (A^*, \leq)$ of sequences of elements of A ordered by prefix. We
have faithful functors

$$P(A) \hookrightarrow M(A)_{\S} \to TC_A$$

where the embedding $M(A)_{\S} \to TC_A$ maps a sequence $\alpha$ to the transition
category $[\alpha] \to M(A)$, where $[a_1 \cdots a_i \cdots a_n]$ is the poset:
$\varepsilon < a_1 < \dots < (a_1 \cdots a_i) < \dots < (a_1 \cdots a_i \cdots a_n)$.

Open-map bisimilarity. The functors $P(A) \to \mathrm{TC}_A$ and $M(A)_{\S} \to \mathrm{TC}_A$
provide canonical choices w.r.t. which to consider the notion of open map [JNW96]
and hence study process behaviour. Indeed, the notion of open map w.r.t. the
inclusion $P(A) \to \mathrm{TC}_A$ corresponds to functional bisimulation (cf. [WN95]),
whilst the notion of open map w.r.t. the embedding $M(A)_{\S} \to \mathrm{TC}_A$ amounts
to functional back-and-forth bisimulation on states (cf. [DNMV90]); see
the general theory developed in Section 2 ‘§ Bisimulation’. Moreover, the fibred
setting allows a natural treatment of weak bisimilarity; this is shown in
Subsection 3.1 ‘§ Saturation’.



1.2 Continuous Systems 

We provide a treatment of continuous systems emphasising the similarities with 
the above view of discrete systems. 

Durations. The notion of duration was introduced by Lawvere in connection
to investigations of processes in continuum physics, see [Law86].

Let $R$ be the additive monoid of non-negative real numbers. A duration is
a functor $d : \mathbb{X} \to R$ such that, for every $e \in \mathbb{X}$ and $t_0, t_1 \in R$, if
$d(e) = t_0 + t_1$ in $R$ then there exists a unique factorisation $e = e_0 \cdot e_1$ in
$\mathbb{X}$ for which $d(e_0) = t_0$ and $d(e_1) = t_1$. The category $\mathrm{Dur}$ has durations as
objects and morphisms $(d : \mathbb{X} \to R) \to (d' : \mathbb{X}' \to R)$ given by functors
$h : \mathbb{X} \to \mathbb{X}'$ such that $d = d'h$.

As remarked in [Law86], an important example of duration is given by
solutions of differential equations. For the simplest such example, consider the
differential equation

$\frac{dx}{dt} = a\,x$    (1)

whose initial value problem with initial condition

$x(0) = s$

has unique solution

$\sigma_s(t) = s\,e^{at}$.

It can be regarded as the duration $\mathbb{S} \to R$, where $|\mathbb{S}| = \mathbb{R}$ and, for
$t \in \mathbb{R}_{\geq 0}$, we have $s_0 \xrightarrow{t} s_1$ in $\mathbb{S}$ if and only if there exists a
solution of (1) $\sigma : I \to \mathbb{R}$ and $t_0 \leq t_1$ in $I$ such that $\sigma(t_0) = s_0$,
$\sigma(t_1) = s_1$, and $t_1 - t_0 = t$ (i.e., if and only if $\sigma_{s_0}(t) = s_1$).
(Compare this construction with the flow of the
differential equation IHBZil.) 
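
The unique factorisation property of this duration can be checked numerically.
The sketch below is an illustration only: the coefficient A, the function names
and the sample values are assumptions made for the example. An evolution
starting at $s_0$ with duration $t_0 + t_1$ is split through the intermediate
state $s_0 e^{a t_0}$, which is the only state through which such a splitting
can pass.

```python
import math

A = 0.5  # hypothetical value for the coefficient a of the equation dx/dt = a x

def flow(s, t, a=A):
    """State reached after time t when starting from s: s * e^(a t)."""
    return s * math.exp(a * t)

def factorise(s0, t0, t1, a=A):
    """Split the evolution s0 --(t0 + t1)--> s1 as
    s0 --t0--> mid --t1--> s1; the intermediate state is forced."""
    mid = flow(s0, t0, a)
    return (s0, t0, mid), (mid, t1, flow(mid, t1, a))

first, second = factorise(2.0, 0.3, 0.7)
# The composite of the two pieces ends where the undivided evolution ends.
assert math.isclose(second[2], flow(2.0, 0.3 + 0.7))
```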

Continuous path categories. The category $R_{\S}$ has $\mathbb{R}_{\geq 0}$ (the set of
non-negative real numbers) as the set of objects and morphisms $t \to t'$ given by
pairs $(x, y)$ in $\mathbb{R}_{\geq 0}$ such that $x + t + y = t'$. A morphism $t \to t'$ can be
thought of as a way of placing an interval of length $t$ within an interval of
length $t'$. The category $R_{>}$ is the subcategory of $R_{\S}$ consisting of initial
placements; viz., morphisms of the form $(0, y)$. Note that $R_{>}$ is the poset
$T = (\mathbb{R}_{\geq 0}, \leq)$. We have faithful functors

$T \to R_{\S} \to \mathrm{Dur}$

where the embedding $R_{\S} \to \mathrm{Dur}$ maps a non-negative real number $t$ to the
difference duration $[t] \to R : (x \leq y) \mapsto y - x$, where $[t]$ is the interval
poset $([0, t], \leq)$.

The concept of open map of durations, w.r.t. the path categories $T$ and $R_{\S}$,
yields natural notions of bisimilarity; consider, for instance, the general results
of Section 2 ‘§ Bisimulation’ in the context of durations.

2 Fibred Models of Processes: General Theory



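Processes. A bundle over a small category $\mathbb{C}$ is a functor $f : \mathbb{X} \to \mathbb{C}$. The
category $\mathrm{Cat}/\mathbb{C}$ has bundles over $\mathbb{C}$ as objects and morphisms
$(f : \mathbb{X} \to \mathbb{C}) \to (f' : \mathbb{X}' \to \mathbb{C})$ given by functors $h : \mathbb{X} \to \mathbb{X}'$ such that
$f = f'h$. The notion of process to be considered is given by certain bundles,
known as ufl (unique factorisation lifting) functors (see, e.g.,
[Law86, SS88, Str96]).

A ufl functor $f : \mathbb{X} \to \mathbb{C}$ (over a small category $\mathbb{C}$) is a functor with the
property that, for every $e \in \mathbb{X}$ and $\alpha_0, \alpha_1 \in \mathbb{C}$, if $f(e) = \alpha_0 \cdot \alpha_1$ in
$\mathbb{C}$ then there exists a unique factorisation $e = e_0 \cdot e_1$ in $\mathbb{X}$ for which
$f(e_0) = \alpha_0$ and $f(e_1) = \alpha_1$. The category $\mathrm{Ufl}/\mathbb{C}$ is the full subcategory
of $\mathrm{Cat}/\mathbb{C}$ with objects given by ufl functors.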

The discrete and continuous systems of Section 1 provide examples of processes
over monoids (viz., $\mathrm{TC}_A = \mathrm{Ufl}/M(A)$ and $\mathrm{Dur} = \mathrm{Ufl}/R$) and as such force
a global view of the state of a process. The extra generality of allowing processes
over arbitrary categories may be used, for example, to provide a distributed view
of the state. For instance, discrete processes operating in three modes (say 0, 1, 2)
and such that in mode $i$ can either keep on computing in that mode or proceed
to compute in mode $(i+1) \bmod 3$, can be naturally modelled as ufl functors over
the free category on the following graph of modes:

[Diagram: a cyclic graph on the nodes 0, 1, 2, with an edge $m_{i,(i+1) \bmod 3}$ from
node $i$ to node $(i+1) \bmod 3$ and loops $a_i$ ($a_i \in A_i$) at each node $i$.]

where, for $0 \leq i \leq 2$, the sets $A_i$ are the observable actions in mode $i$ and the
transitions $m_{i,(i+1) \bmod 3}$ represent the allowable mode changes.



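Path categories. The categories of processes $\mathrm{Ufl}/\mathbb{C}$ come equipped with
canonical categories of paths generalising the constructions of Section 1.

Mac Lane’s twisted arrow category [ML71, page 223] $\mathbb{C}_{\S}$ has objects given
by morphisms in $\mathbb{C}$ and morphisms $\alpha \to \beta$ given by pairs of maps $(\varphi, \gamma)$ in
$\mathbb{C}$ such that $\varphi \cdot \alpha \cdot \gamma = \beta$; such a morphism can be regarded as a context,
$\varphi \cdot (\_) \cdot \gamma$, within which $\alpha$ can be placed to yield $\beta$. The category $\mathbb{C}_{>}$
is the subcategory of $\mathbb{C}_{\S}$ consisting of the morphisms of the form
$(\mathrm{id}, \gamma)$. We have faithful functors

$\mathbb{C}_{>} \xrightarrow{\ J_{\mathbb{C}}\ } \mathbb{C}_{\S} \xrightarrow{\ u_{\mathbb{C}}\ } \mathrm{Ufl}/\mathbb{C}$    (2)

where the embedding $u_{\mathbb{C}} : \mathbb{C}_{\S} \to \mathrm{Ufl}/\mathbb{C}$ maps:

— an object $(a_0 \xrightarrow{\alpha} a_1) \in |\mathbb{C}_{\S}|$ to the ufl functor
$u_\alpha : [\alpha] \to \mathbb{C} : (\alpha_0, a, \alpha_1) \mapsto a$, where the interval category
$[\alpha]$ has objects given by factorisations
$(a_0 \xrightarrow{\alpha_0} a \xrightarrow{\alpha_1} a_1)$ of $\alpha$ in $\mathbb{C}$, and morphisms
$e : (\alpha_0, a, \alpha_1) \to (\alpha'_0, a', \alpha'_1)$ given by maps $e : a \to a'$ in $\mathbb{C}$
such that $\alpha_0 \cdot e = \alpha'_0$ and $\alpha_1 = e \cdot \alpha'_1$; and

— a morphism $(\varphi, \gamma) : \alpha \to \beta$ in $\mathbb{C}_{\S}$ to the functor
$[\varphi, \gamma] : [\alpha] \to [\beta] : (\alpha_0, \alpha_1) \mapsto (\varphi \cdot \alpha_0, \alpha_1 \cdot \gamma)$.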

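Note the (natural) bijective correspondence

$\mathrm{Ufl}/\mathbb{C}(u_\alpha, f) \cong f^{-1}(\alpha)$    (3)
$h \mapsto h\big((\mathrm{id}, \alpha) \to (\alpha, \mathrm{id})\big)$

intuitively stating that computation paths (or runs) of shape $\alpha$ of the process
$f$ (viz., morphisms $([\alpha] \to \mathbb{C}) \to (\mathbb{X} \to \mathbb{C})$) correspond to evolutions in the
state space taking place over $\alpha$ (viz., morphisms $e \in \mathbb{X}$ such that $f(e) = \alpha$).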

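In $\mathbb{C}_{\S}$, the commutative diagrams

[Diagram (4): for $a \xrightarrow{\alpha} b \xrightarrow{\beta} c$ in $\mathbb{C}$, the square with top vertex
$\mathrm{id}_b$, maps $(\alpha, \mathrm{id}_b) : \mathrm{id}_b \to \alpha$ and $(\mathrm{id}_b, \beta) : \mathrm{id}_b \to \beta$, and maps
$(\mathrm{id}_a, \beta) : \alpha \to \alpha \cdot \beta$ and $(\alpha, \mathrm{id}_c) : \beta \to \alpha \cdot \beta$ into the bottom
vertex $\alpha \cdot \beta$.]    (4)

are pushouts, and the embedding $u_{\mathbb{C}}$ exhibits $\mathrm{Ufl}/\mathbb{C}$ as the free cocompletion
of $\mathbb{C}_{\S}$ respecting these pushouts. More precisely, we have the following
universal characterisation.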



Theorem 2.1. (cf. [BF99, § 2]) The category $\mathrm{Ufl}/\mathbb{C}$ is cocomplete and the
embedding $u_{\mathbb{C}} : \mathbb{C}_{\S} \to \mathrm{Ufl}/\mathbb{C}$ preserves the pushouts (4). Moreover, for every
cocomplete category $\mathcal{C}$ and functor $F : \mathbb{C}_{\S} \to \mathcal{C}$ preserving the pushouts (4)
we have that

1. there exists a cocontinuous functor $F' : \mathrm{Ufl}/\mathbb{C} \to \mathcal{C}$ such that
$F' u_{\mathbb{C}} \cong F$, and

2. for every natural transformation $\varphi : F' u_{\mathbb{C}} \to G' u_{\mathbb{C}}$, where $F'$ and $G'$
are cocontinuous functors $\mathrm{Ufl}/\mathbb{C} \to \mathcal{C}$, there exists a unique natural
transformation $\varphi' : F' \to G'$ such that $\varphi' u_{\mathbb{C}} = \varphi$.

Condition (1) asserts that every functor $\mathbb{C}_{\S} \to \mathcal{C}$ preserving the pushouts (4)
has (up to isomorphism) a cocontinuous extension $\mathrm{Ufl}/\mathbb{C} \to \mathcal{C}$; whilst
condition (2) guarantees that such extensions are unique up to canonical
isomorphism.

Theorem 2.1 is important for it shows that the processes in $\mathrm{Ufl}/\mathbb{C}$ consist
of properly glued copies of the basic processes, $[\alpha] \to \mathbb{C}$, arising from
computation paths. Formally, we have that, for every
$(f : \mathbb{X} \to \mathbb{C}) \in |\mathrm{Ufl}/\mathbb{C}|$,

$f \cong \mathrm{colim}\big( \mathbb{X}_{\S} \xrightarrow{\ f_{\S}\ } \mathbb{C}_{\S} \xrightarrow{\ u_{\mathbb{C}}\ } \mathrm{Ufl}/\mathbb{C} \big)$

where $f_{\S} : e \mapsto f(e)$.

V 

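Pointed processes. For $(u : \mathbb{U} \to \mathbb{C}) \in |\mathrm{Ufl}/\mathbb{C}|$, the category $u/\mathrm{Ufl}/\mathbb{C}$
has: objects given by triples $(x, \mathbb{X}, f)$, where $x : \mathbb{U} \to \mathbb{X}$ is a functor and
$f : \mathbb{X} \to \mathbb{C}$ is a ufl functor, such that $f x = u$; and morphisms
$(x, \mathbb{X}, f) \to (x', \mathbb{X}', f')$ given by functors $h : \mathbb{X} \to \mathbb{X}'$ such that $f = f'h$
and $h x = x'$.

For instance, $(0 \to \mathbb{C})/\mathrm{Ufl}/\mathbb{C} \cong \mathrm{Ufl}/\mathbb{C}$. Further, note that for $\alpha \in \mathbb{C}$, by
the correspondence (3), $u_\alpha/\mathrm{Ufl}/\mathbb{C}$ is (isomorphic to) the category with objects
$(e \in \mathbb{X}, (f : \mathbb{X} \to \mathbb{C}) \in |\mathrm{Ufl}/\mathbb{C}|)$ such that $f(e) = \alpha$, and with morphisms
$(e, f) \to (e', f')$ given by maps $h : f \to f'$ in $\mathrm{Ufl}/\mathbb{C}$ such that $h(e) = e'$.
In particular, $u_\varepsilon/\mathrm{Ufl}/M(A)$ is the category of transition categories with a
designated object (typically modelling an initial state).

We remark that the situation (2) induces the following one:

$\alpha/\mathbb{C}_{>} \to u_\alpha/\mathrm{Ufl}/\mathbb{C}$    ($\alpha \in \mathbb{C}$)

between pointed path categories and pointed processes.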

Linearly-controlled processes. We turn attention to interleaving models.
These consist of processes varying over a category whose basic processes (or
paths) are linear. Formally, we say that a category $\mathbb{C}$ is path-linearisable if, for
every morphism $\alpha \in \mathbb{C}$, the interval category $[\alpha]$ is a linear preorder, and we
refer to ufl functors over path-linearisable categories as linearly-controlled
processes (see [BF99, § 4]).

Theorem 2.2. ([BF99, § 4]) A category $\mathbb{C}$ is path-linearisable if and only if the
functor $[\_] : \mathbb{C}_{\S} \to \mathrm{Cat} : \alpha \mapsto [\alpha]$ preserves the pushouts (4).
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(In the setting of mm. this theorem corresponds to the equivalence between 



the properties (CFI) and (IG).) 

Examples of path-linearisable categories are the monoids $M(A)$ (for any set
$A$) and $R$, the posets $P(A)$ (for any set $A$) and $T$, and free categories on graphs.
Moreover, path-linearisable categories are closed under the sum and tensor (see
‘§ Tensor’ in Section 3) of categories, but not under product.
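
The following sketch illustrates these claims in the simplest case. It
enumerates the interval category of a word in a free monoid (represented here as
a Python string) and checks that it is a linear preorder, and then exhibits an
element of a product of two free monoids whose interval is not linear. The
function names are ours and serve only this example.

```python
def splits(word):
    """Objects of the interval category of `word` in a free monoid:
    all factorisations word = w0 . w1."""
    return [(word[:i], word[i:]) for i in range(len(word) + 1)]

def has_map(src, tgt):
    """There is a map src -> tgt iff some e satisfies
    src[0] . e == tgt[0] and src[1] == e . tgt[1]."""
    (a0, a1), (b0, b1) = src, tgt
    if not b0.startswith(a0):
        return False
    e = b0[len(a0):]
    return a1 == e + b1

def has_map_pair(src, tgt):
    """Maps in the interval of an element (u, v) of a product of two free
    monoids are computed componentwise."""
    return has_map(src[0], tgt[0]) and has_map(src[1], tgt[1])

def is_linear(objects, related):
    """A preorder is linear when any two objects are comparable."""
    return all(related(x, y) or related(y, x) for x in objects for y in objects)

# The interval of the word "abc" is the chain of its prefixes.
assert is_linear(splits("abc"), has_map)

# In the product monoid, the interval of ("a", "b") contains the two
# incomparable factorisations ("a", "") . ("", "b") and ("", "b") . ("a", "").
product_interval = [(x, y) for x in splits("a") for y in splits("b")]
assert not is_linear(product_interval, has_map_pair)
```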

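There is a universal way in which to transform a bundle over a path-linearisable
category into a linearly-controlled process. The construction is given by
the proposition below.

Proposition 2.1. For a path-linearisable category $\mathbb{C}$, the embedding
$\mathrm{Ufl}/\mathbb{C} \hookrightarrow \mathrm{Cat}/\mathbb{C}$ has a right adjoint $U : \mathrm{Cat}/\mathbb{C} \to \mathrm{Ufl}/\mathbb{C}$.

One can explicitly define $U(f : \mathbb{X} \to \mathbb{C})$ as $\mathbb{K} \to \mathbb{C}$, where $\mathbb{K}$ has:
objects given by pairs $(C, k)$ with $k : ([\mathrm{id}_C] \to \mathbb{C}) \to (f : \mathbb{X} \to \mathbb{C})$ in
$\mathrm{Cat}/\mathbb{C}$; morphisms $(C, k) \to (C', k')$ given by pairs $(\gamma, c)$ with
$\gamma : C \to C'$ in $\mathbb{C}$ and $c : ([\gamma] \to \mathbb{C}) \to f$ in $\mathrm{Cat}/\mathbb{C}$ such that
$k = c \circ [\mathrm{id}_C, \gamma]$ and $c \circ [\gamma, \mathrm{id}_{C'}] = k'$; identities
$\mathrm{id}_{(C,k)} = (\mathrm{id}_C, k)$; and composition
$(\gamma, c) \cdot (\gamma', c') = (\gamma \cdot \gamma', c \cdot c')$ given by the following construction

[Diagram: the pushout square obtained by applying $[\_]$ to (4) for $\gamma$ and
$\gamma'$, from which $c \cdot c' : [\gamma \cdot \gamma'] \to \mathbb{X}$ is induced by $c$ and $c'$.]

It follows from the proposition that, for $\mathbb{C}$ path-linearisable, colimits in
$\mathrm{Ufl}/\mathbb{C}$ are computed as in $\mathrm{Cat}/\mathbb{C}$. In particular, we have the following
corollary.

Corollary 2.1. For a linearly-controlled process $f : \mathbb{X} \to \mathbb{C}$,

$\mathbb{X} \cong \mathrm{colim}\big( \mathbb{X}_{\S} \xrightarrow{\ f_{\S}\ } \mathbb{C}_{\S} \xrightarrow{\ [\_]\ } \mathrm{Cat} \big)$.

Bisimulation. For bundles $f : \mathbb{X} \to \mathbb{C}$ and $f' : \mathbb{X}' \to \mathbb{C}$, a relation
$R \subseteq |\mathbb{X}| \times |\mathbb{X}'|$ is said to be a simulation (cf. [Mil89]) if $x_0 \, R \, x'_0$
implies that, for every $e : x_0 \to x$ in $\mathbb{X}$, there exists $e' : x'_0 \to x'$ in $\mathbb{X}'$
such that $f(e) = f'(e')$ and $x \, R \, x'$.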
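
For finite transition graphs the simulation condition can be checked directly.
The sketch below is a minimal illustration under the assumption that each
process is presented by a finite set of labelled transitions (source, label,
target), so that it suffices to test the condition on single transitions; the
function name and the sample systems are hypothetical.

```python
def is_simulation(rel, trans1, trans2):
    """Check that `rel` (a set of state pairs) is a simulation from the system
    with transitions `trans1` to the one with transitions `trans2`."""
    for (x0, y0) in rel:
        for (s, a, x) in trans1:
            if s != x0:
                continue
            # Look for an equally labelled transition from y0 into a related state.
            if not any(s2 == y0 and a2 == a and (x, y) in rel
                       for (s2, a2, y) in trans2):
                return False
    return True

small = {("p0", "a", "p1")}
large = {("q0", "a", "q1"), ("q0", "b", "q2")}

# `large` simulates `small`, but not conversely, since `small` has no b-transition.
assert is_simulation({("p0", "q0"), ("p1", "q1")}, small, large)
assert not is_simulation({("q0", "p0"), ("q1", "p1"), ("q2", "p1")}, large, small)
```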



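Proposition 2.2. For a path-linearisable category $\mathbb{C}$, consider
$h : (f : \mathbb{X} \to \mathbb{C}) \to (f' : \mathbb{X}' \to \mathbb{C})$ in $\mathrm{Ufl}/\mathbb{C}$. The functor $h$ is open
w.r.t. the inclusion $\mathbb{C}_{>} \to \mathrm{Ufl}/\mathbb{C}$ if and only if, for every $x_0 \in |\mathbb{X}|$ and
every $e' : h(x_0) \to x'$ in $\mathbb{X}'$, there exists $e : x_0 \to x$ in $\mathbb{X}$ such that
$h(e) = e'$.

For processes $f : \mathbb{X} \to \mathbb{C}$ and $f' : \mathbb{X}' \to \mathbb{C}$, we say that $x \in |\mathbb{X}|$ and
$x' \in |\mathbb{X}'|$ such that $f(x) = f'(x')$ are open-map bisimilar if there exists a
span of functors

$(\mathbb{X} \to \mathbb{C}) \xleftarrow{\ h\ } (\mathbb{W} \to \mathbb{C}) \xrightarrow{\ h'\ } (\mathbb{X}' \to \mathbb{C})$

which are open w.r.t. the inclusion $\mathbb{C}_{>} \to \mathrm{Ufl}/\mathbb{C}$ such that $h(w) = x$ and
$h'(w) = x'$, for some $w \in |\mathbb{W}|$.

Corollary 2.2. For linearly-controlled processes, the notions of bisimilarity and
of open-map bisimilarity coincide.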



Relation to presheaf models. The category of processes Ufl/c is intimately 
related to the presheaf categories over the path categories (see EESni) and 
C>. 

We have faithful functors

$\mathrm{Ufl}/\mathbb{C} \xrightarrow{\ i\ } \widehat{\mathbb{C}_{\S}} \xrightarrow{\ j\ } \widehat{\mathbb{C}_{>}}$    (5)

where $i(f) = \mathrm{Ufl}/\mathbb{C}(u_{(\_)}, f)$ (cf. (3)) and $j(P) = P\,J_{\mathbb{C}}^{\mathrm{op}}$, that
correspond to unfolding a process into the presheaf of its computations (or runs).
This is clearly seen when $\mathbb{C} = M(A)$, in which case
$\widehat{\mathbb{C}_{>}} = \widehat{P(A)}$ is (equivalent to) the category $\mathrm{SF}_A$ of
$A$-labelled synchronisation forests (see [JNW96]) and the composite (5) amounts to
the functor

$\mathrm{TC}_A \to \mathrm{SF}_A$

that unfolds a transition graph into the forest of synchronisation trees rooted at
each state (see [WN95, JNW96]).

In more detail, we have the following relationship between the fibred viewpoint
and the presheaf models



where the functors ji and a are obtained as free cocontinous extensions. This 
diagram allows us to relate the various notions of open map (and hence of bisi- 
milarity) as follows. 

Proposition 2.3. The functors ji, a, i, and j preserve open maps (w.r.t. the 
embeddings yc>, o,nd uc/ 

Corollary 2.3. A functor h in Ufl/c uc-open if and only if the map i{h) in 
isyc^-open. 

3 Constructions on Processes 

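We present a tool-kit of categorical constructions on processes that can be
regarded as the basis of an abstract process description language, cf. [WN95, § 2]
and EM3.

Composition. We consider two versions of operations obtained by post-composition.

1. For a ufl functor $g : \mathbb{A} \to \mathbb{B}$, we define

$\Sigma_g : \mathrm{Ufl}/\mathbb{A} \to \mathrm{Ufl}/\mathbb{B} : (f : \mathbb{X} \to \mathbb{A}) \mapsto (g f : \mathbb{X} \to \mathbb{B})$.

2. For a functor $g : \mathbb{A} \to \mathbb{B}$ where $\mathbb{B}$ is path-linearisable, we define

$U_g : \mathrm{Ufl}/\mathbb{A} \to \mathrm{Ufl}/\mathbb{B} : f \mapsto U(g f)$,

with $U$ the right adjoint of Proposition 2.1.

We will also need the following operation given by pre-composition: for
$h : u \to v$ in $\mathrm{Ufl}/\mathbb{C}$, we define

$(\_)h : v/\mathrm{Ufl}/\mathbb{C} \to u/\mathrm{Ufl}/\mathbb{C} : (y, \mathbb{Y}, g) \mapsto (y h, \mathbb{Y}, g)$.

Pullback. For a functor $h : \mathbb{A} \to \mathbb{B}$, we define the pullback (along $h$) functor

$h^* : \mathrm{Ufl}/\mathbb{B} \to \mathrm{Ufl}/\mathbb{A} : (f : \mathbb{X} \to \mathbb{B}) \mapsto (h^* f : h^* \mathbb{X} \to \mathbb{A})$

where $h^* f$ is the first-projection functor from the subcategory $h^* \mathbb{X}$ of
$\mathbb{A} \times \mathbb{X}$ with morphisms $(a, e) : (a, x) \to (a', x')$ such that $h(a) = f(e)$.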

Saturation. The operation of saturation {cf. HIM!) by a functor h : A — > B, 
where A is path-linearisable, is given by the monad on Ufl/A induced by the 
following composite of adjoints: 

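$\mathrm{Ufl}/\mathbb{A} \rightleftarrows \mathrm{Cat}/\mathbb{A} \overset{\Sigma_h}{\rightleftarrows} \mathrm{Cat}/\mathbb{B}$.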

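Product. The product functor is defined as follows

$\_ \times \_ : \mathrm{Ufl}/\mathbb{A} \times \mathrm{Ufl}/\mathbb{B} \to \mathrm{Ufl}/(\mathbb{A} \times \mathbb{B})$
$(f : \mathbb{X} \to \mathbb{A},\ g : \mathbb{Y} \to \mathbb{B}) \mapsto (f \times g : \mathbb{X} \times \mathbb{Y} \to \mathbb{A} \times \mathbb{B})$.

Sum. The sum functor is given by the construction

$\_ + \_ : \mathrm{Ufl}/\mathbb{A} \times \mathrm{Ufl}/\mathbb{B} \to \mathrm{Ufl}/(\mathbb{A} + \mathbb{B})$
$(f : \mathbb{X} \to \mathbb{A},\ g : \mathbb{Y} \to \mathbb{B}) \mapsto (f + g : \mathbb{X} + \mathbb{Y} \to \mathbb{A} + \mathbb{B})$

and induces the fibred coproduct functor $(\_) +_{\mathbb{C}} (\_)$ given by the composite

$\mathrm{Ufl}/\mathbb{C} \times \mathrm{Ufl}/\mathbb{C} \xrightarrow{\ +\ } \mathrm{Ufl}/(\mathbb{C} + \mathbb{C}) \to \mathrm{Ufl}/\mathbb{C}$.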

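Pushout. For $h : u \to v$ in $\mathrm{Ufl}/\mathbb{C}$ where $\mathbb{C}$ is path-linearisable, we have a
pushout (along $h$) functor

$h.\_ : u/\mathrm{Ufl}/\mathbb{C} \to v/\mathrm{Ufl}/\mathbb{C}$
$(x, \mathbb{X}, f) \mapsto (h.x, \mathbb{V}.\mathbb{X}, v.f)$

defined by the diagram

[Diagram: the pushout square of $x : \mathbb{U} \to \mathbb{X}$ along $h : \mathbb{U} \to \mathbb{V}$, with vertex
$\mathbb{V}.\mathbb{X}$, map $h.x : \mathbb{V} \to \mathbb{V}.\mathbb{X}$, and induced functor $v.f : \mathbb{V}.\mathbb{X} \to \mathbb{C}$.]
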
Tensor. The tensor category (see, e.g., § 2] or Pt^ § 6]) A(g)B of the 

small categories $\mathbb{A}$ and $\mathbb{B}$ has objects given by pairs $a \otimes b$, with $a \in |\mathbb{A}|$
and $b \in |\mathbb{B}|$, and morphisms given by interleaved (or shuffled) sequences of
non-identity composable maps in $\mathbb{A}$ and $\mathbb{B}$; i.e., morphisms are formal sequences
of non-identity morphisms of the form

$a_1 \otimes b_1 \xrightarrow{\ \alpha_1 \otimes b_1\ } a_2 \otimes b_1$    or    $a_1 \otimes b_1 \xrightarrow{\ a_1 \otimes \beta_1\ } a_1 \otimes b_2$.

The tensor of categories has a universal characterisation as the following
pushout square of functors

[Diagram: pushout square with span
$\mathbb{A} \times |\mathbb{B}| \leftarrow |\mathbb{A}| \times |\mathbb{B}| \rightarrow |\mathbb{A}| \times \mathbb{B}$ and vertex $\mathbb{A} \otimes \mathbb{B}$.]

from which one easily sees that the tensor category of two monoids is their
coproduct in the category of monoids. Thus, in particular,

$M(A) \otimes M(B) \cong M(A + B)$.

Moreover, the construction induces a tensor functor between categories of
processes:

$\_ \otimes \_ : \mathrm{Ufl}/\mathbb{A} \times \mathrm{Ufl}/\mathbb{B} \to \mathrm{Ufl}/(\mathbb{A} \otimes \mathbb{B})$.


3.1 Examples 

We show that, for transition categories, the above categorical constructions yield
typical operations of process calculi (cf. [WN95]). Hybrid systems are discussed
in Section 4.

Relabelling. The operation of relabelling according to a function $\rho : A \to B$
corresponds to the functor

$\_\{\rho\} = \Sigma_{M(\rho)}(\_) : \mathrm{Ufl}/M(A) \to \mathrm{Ufl}/M(B)$

where $M(\rho)$ is the free homomorphic extension of the function mapping:
$a \in A \mapsto \rho(a)$.

Restriction. Let $A \subseteq B$, and write $\iota : A \to B$ for the inclusion function. The
restriction operation corresponds to the functor

$\_ \restriction \iota = M(\iota)^*(\_) : \mathrm{Ufl}/M(B) \to \mathrm{Ufl}/M(A)$.
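
On the level of transition graphs, these two operations act edge by edge. The
sketch below is an illustration under that reading: a process is presented by a
list of labelled transitions, relabelling renames labels, and restriction keeps
exactly the transitions whose label lies in the smaller alphabet. The function
names and the sample system are hypothetical.

```python
def relabel(trans, rho):
    """Relabelling along a function rho (given as a dict): rename every label."""
    return [(s, rho[a], t) for (s, a, t) in trans]

def restrict(trans, allowed):
    """Restriction to a sub-alphabet: keep the transitions labelled in `allowed`.
    The set of states is unchanged."""
    return [(s, a, t) for (s, a, t) in trans if a in allowed]

system = [(0, "a", 1), (1, "b", 2), (2, "c", 0)]

assert relabel(system, {"a": "x", "b": "x", "c": "y"}) == [
    (0, "x", 1), (1, "x", 2), (2, "y", 0)]
assert restrict(system, {"a", "b"}) == [(0, "a", 1), (1, "b", 2)]
```

A list rather than a set of transitions is used so that relabelling cannot
silently merge two distinct edges that end up with the same label.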

Saturation. The main example of a saturation monad is given by the con- 
struction that associates a transition graph with the transition graph obtained 
from saturating transitions by the hidden insertion of silent actions; i.e., the 
construction introduced by Milner to define weak bisimulation (see 

In our setting (as in [FCW99], where an account of weak bisimilarity for
presheaf models required, crucially, the adoption of a fibred view of processes),
the saturation by silent actions arises as the saturation monad (of Section 3
‘§ Saturation’) by the homomorphism $M(A + \{\tau\}) \to M(A)$ that hides the
silent $\tau$ actions (viz., the free homomorphic extension of the function mapping:
$a \in A \mapsto a$, $\tau \mapsto \varepsilon$).
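
For a finite transition graph, the classical saturation construction can be
computed as follows. The sketch is an illustration only, with hypothetical
names: it adds a $\tau$-labelled edge between any two states connected by a
(possibly empty) sequence of silent steps, and an $a$-labelled edge wherever an
$a$-transition can be reached and followed by silent steps.

```python
def tau_closure(states, trans, tau="tau"):
    """reach[s] is the set of states reachable from s by zero or more tau steps."""
    reach = {s: {s} for s in states}
    changed = True
    while changed:
        changed = False
        for (p, a, q) in trans:
            if a != tau:
                continue
            for s in states:
                if p in reach[s] and q not in reach[s]:
                    reach[s].add(q)
                    changed = True
    return reach

def saturate(states, trans, tau="tau"):
    """Saturated transitions: s =tau=> t iff s tau* t, and
    s =a=> t iff s tau* a tau* t for every visible action a."""
    reach = tau_closure(states, trans, tau)
    weak = {(s, tau, t) for s in states for t in reach[s]}
    for (p, a, q) in trans:
        if a == tau:
            continue
        for s in states:
            if p in reach[s]:
                weak.update((s, a, t) for t in reach[q])
    return weak

weak = saturate([0, 1, 2], {(0, "tau", 1), (1, "a", 2)})
assert (0, "a", 2) in weak and (0, "tau", 0) in weak
```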

Generalising from the above discussion, we have the following general definition
of weak bisimulation w.r.t. a hiding functor $h : \mathbb{A} \to \mathbb{B}$: for processes
$f : \mathbb{X} \to \mathbb{A}$ and $f' : \mathbb{X}' \to \mathbb{A}$, we say that $x \in |\mathbb{X}|$ and $x' \in |\mathbb{X}'|$ are
$h$-bisimilar (cf. [FCW99]) whenever $\eta_f(x)$ and $\eta_{f'}(x')$ are open-map
bisimilar, where $\eta$ denotes the unit of the $h$-saturation monad.

Parallel composition. We discuss synchronous and asynchronous compositions
of processes.

Synchrony. As observed in [WN95, § 2.2.4], versions of parallel composition can
be obtained from the product operation by restriction and relabelling. Roughly,
the synchronous parallel composition of processes can be specified as the
combined operation

$\Sigma_\rho(\sigma^*(\_ \times \_)) : \mathrm{Ufl}/\mathbb{A} \times \mathrm{Ufl}/\mathbb{B} \to \mathrm{Ufl}/\mathbb{C}$

where $\sigma : \mathbb{S} \to \mathbb{A} \times \mathbb{B}$ is a synchronisation functor and $\rho : \mathbb{S} \to \mathbb{C}$ a
relabelling ufl functor.

Synchronisation à la CCS is given by the functor

$(\sigma^*(\_ \times \_))\{\rho\} : \mathrm{Ufl}/M(A + \{\tau\} + B) \times \mathrm{Ufl}/M(B + \{\tau\} + C) \to \mathrm{Ufl}/M(A + \{\tau\} + C)$

where the synchronisation map

$M(A + \{(\tau, *)\} + B + \{(*, \tau)\} + C) \to M(A + \{\tau\} + B) \times M(B + \{\tau\} + C)$

is the free homomorphic extension of the function mapping: $a \in A \mapsto (a, \varepsilon)$,
$(\tau, *) \mapsto (\tau, \varepsilon)$, $b \in B \mapsto (b, b)$, $(*, \tau) \mapsto (\varepsilon, \tau)$,
$c \in C \mapsto (\varepsilon, c)$; and where the relabelling function

$A + \{(\tau, *)\} + B + \{(*, \tau)\} + C \to A + \{\tau\} + C$

is the following mapping: $a \in A \mapsto a$, $(\tau, *) \mapsto \tau$, $b \in B \mapsto \tau$,
$(*, \tau) \mapsto \tau$, $c \in C \mapsto c$.

On the other hand, synchronisation à la CSP corresponds to

$(\sigma^*(\_ \times \_))\{\rho\} : \mathrm{Ufl}/M(A + \{\tau\}) \times \mathrm{Ufl}/M(A + \{\tau\}) \to \mathrm{Ufl}/M(A + \{\tau\})$

where the synchronisation map

$M(A + \{(\tau, *)\} + \{(*, \tau)\}) \to M(A + \{\tau\}) \times M(A + \{\tau\})$

is the free homomorphic extension of the function mapping: $a \in A \mapsto (a, a)$,
$(\tau, *) \mapsto (\tau, \varepsilon)$, $(*, \tau) \mapsto (\varepsilon, \tau)$; and where the relabelling
function

$A + \{(\tau, *)\} + \{(*, \tau)\} \to A + \{\tau\}$

is the following mapping: $a \in A \mapsto a$, $(\tau, *) \mapsto \tau$, $(*, \tau) \mapsto \tau$.

Asynchrony. The asynchronous parallel composition of processes can be described
along the above lines as the functor

$\sigma^*(\_ \times \_) : \mathrm{Ufl}/M(A) \times \mathrm{Ufl}/M(B) \to \mathrm{Ufl}/M(A + B)$

where the synchronisation map

$M(A + B) \to M(A) \times M(B)$

is the free homomorphic extension of the function mapping: $a \in A \mapsto (a, \varepsilon)$,
$b \in B \mapsto (\varepsilon, b)$.
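
On finite transition graphs, the effect of these synchronisation maps can be
sketched as a product construction in which shared labels are performed jointly
and all other labels are interleaved. The code below is an illustration only:
the function name, the parameter `shared` and the sample systems are
assumptions of this example, and the relabelling step is omitted. With an empty
set of shared labels it yields the asynchronous composition; with all visible
labels shared it yields synchronisation in the style of CSP.

```python
def parallel(states1, trans1, states2, trans2, shared):
    """Transitions of the composite system on pairs of states."""
    result = set()
    for (p, a, p2) in trans1:
        if a in shared:
            # A shared action needs a matching move of the second component.
            for (q, b, q2) in trans2:
                if b == a:
                    result.add(((p, q), a, (p2, q2)))
        else:
            # A private action of the first component leaves the second idle.
            for q in states2:
                result.add(((p, q), a, (p2, q)))
    for (q, b, q2) in trans2:
        if b not in shared:
            for p in states1:
                result.add(((p, q), b, (p, q2)))
    return result

left = {(0, "a", 1)}
right = {("x", "b", "y")}
interleaved = parallel([0, 1], left, ["x", "y"], right, shared=set())
assert len(interleaved) == 4

joint = parallel([0, 1], left, ["x", "y"], {("x", "a", "y")}, shared={"a"})
assert joint == {((0, "x"), "a", (1, "y"))}
```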

However, there is a more direct description; namely, as the tensor functor

$\_ \otimes \_ : \mathrm{Ufl}/M(A) \times \mathrm{Ufl}/M(B) \to \mathrm{Ufl}/(M(A) \otimes M(B))$.

Non-deterministic sum. For sets $A$ and $B$, let $\iota_A : A \to A \cup B$ and
$\iota_B : B \to A \cup B$ be the respective inclusion functions.

The non-deterministic sum of transition categories is given by the following
construction:

$(\_ +_{M(A \cup B)} \_) \circ (\_\{\iota_A\} \times \_\{\iota_B\}) : \mathrm{Ufl}/M(A) \times \mathrm{Ufl}/M(B) \to \mathrm{Ufl}/M(A \cup B)$.

Prefix. The prefix operation on transition categories, for $a \in A$, arises as the
composite

$((\_)[\varepsilon, a]) \circ ([a, \varepsilon].\_) : u_\varepsilon/\mathrm{Ufl}/M(A) \to u_\varepsilon/\mathrm{Ufl}/M(A)$.

4 Fibred Models of Hybrid Systems

Hybrid systems are typically described as mixed discrete-continuous systems, 
and our development has been aimed at putting this slogan in mathematical 
form. 

We have considered the categories $\mathrm{Ufl}/\mathbb{C}$ as fibred models of processes, and
observed, in Subsections 1.1 and 1.2, that for $\mathbb{C} = M(A)$ the resulting processes
correspond to discrete systems, whilst for $\mathbb{C} = R$ they correspond to continuous
ones. Models of hybrid systems arise from considering categories $\mathbb{C}$ where discrete
actions and continuous evolutions are interleaved. The simplest example is the
model

$\mathrm{Ufl}/(M(A) \otimes R)$

consisting of processes $\mathbb{X} \to M(A) \otimes R$, where the evolutions in the state space
$\mathbb{X}$ factor uniquely as interleavings (or shuffles) of discrete and continuous
evolutions. Examples of such processes arise from hybrid automata
(cf. [Hen96, Definition 1.3]).

The models of discrete, continuous, and hybrid systems introduced relate as 
follows 

Ufl/M(A) Ufl/R 




Thus the notions of open map (and hence of bisimilarity) for hybrid systems
extend those of discrete and continuous systems.

We remark that other non-standard models of hybrid systems can also be
accommodated. For instance, hybrid systems with two modes of operation (discrete
and continuous) where mode changes are explicit, can be seen as processes over
the category generated by the diagram

[Diagram: two nodes $D$ and $C$, with loops $a$ ($a \in A$) on $D$ and $t$
($t \in \mathbb{R}_{\geq 0}$) on $C$, and mode-change edges $c : D \to C$ and $d : C \to D$.]

under the condition that the composite $t \cdot t'$ be $t + t'$, for all
$t, t' \in \mathbb{R}_{\geq 0}$.

4.1 Constructions on Hybrid Systems 

We exemplify the use of the constructions of Section 3 on hybrid systems.

Parallel composition. The following operation corresponds to the parallel
composition of hybrid systems as considered in [Hen96, § 1.4]:

$\sigma^*(\_ \times \_) : \mathrm{Ufl}/(M(A + B) \otimes R) \times \mathrm{Ufl}/(M(B + C) \otimes R) \to \mathrm{Ufl}/(M(A + B + C) \otimes R)$

where the synchronisation map

$M(A + B + C) \otimes R \to (M(A + B) \otimes R) \times (M(B + C) \otimes R)$

is the unique homomorphism such that: $a \in A \mapsto (a, \varepsilon)$, $b \in B \mapsto (b, b)$,
$c \in C \mapsto (\varepsilon, c)$, $t \in \mathbb{R}_{\geq 0} \mapsto (t, t)$.

Time abstraction. We consider two versions of time abstraction. They both
arise as the operation

$U_{\bar\rho}\,\bar h^{*} : \mathrm{Ufl}/(M(A) \otimes R) \to \mathrm{Ufl}/M(A + \{\tau\})$

where the diagram

[Diagram: square with $\bar h : \mathbb{M} \to M(A) \otimes R$ on top,
$\bar\rho : \mathbb{M} \to M(A + \{\tau\})$ on the left, $\rho : M(A) \otimes R \to \mathbb{C}$ on the right,
and $h : M(A + \{\tau\}) \to \mathbb{C}$ at the bottom.]

is a pullback square, for suitable choices of $\rho$ and $h$.

1. The following version of time abstraction corresponds to the one considered
in [Hen96, Definition 1.3]. Take $\mathbb{C} = M(A) \otimes E$, where $E$ is the monoid given
by the following table

     E   | id | τ
     id  | id | τ
     τ   | τ  | τ

together with $\rho : M(A) \otimes R \to M(A) \otimes E$ the unique homomorphism
mapping: $a \in A \mapsto a$, $t \in \mathbb{R}_{>0} \mapsto \tau$ and
$h : M(A + \{\tau\}) \to M(A) \otimes E$ the free homomorphic extension of the function
mapping: $a \in A \mapsto a$, $\tau \mapsto \tau$.

2. A slightly simpler time-abstraction operation is obtained by taking
$\mathbb{C} = M(A)$ together with $\rho : M(A) \otimes R \to M(A)$ the unique homomorphism
mapping: $a \in A \mapsto a$, $t \in \mathbb{R}_{\geq 0} \mapsto \varepsilon$ and $h : M(A + \{\tau\}) \to M(A)$ the
free homomorphic extension of the function mapping: $a \in A \mapsto a$, $\tau \mapsto \varepsilon$.

In this case, for a hybrid system $f : \mathbb{X} \to M(A) \otimes R$, the transition category
$U_{\bar\rho}(\bar h^{*} f)$ obtained by time abstraction corresponds to the transition
graph with set of nodes $|\mathbb{X}|$, set of edges
$\{ (a, e) \in A \times \mathbb{X} \mid f(e) = a \lor f(e) = t \cdot a \lor f(e) = a \cdot t' \lor f(e) = t \cdot a \cdot t'$,
for some $t, t' \in \mathbb{R}_{\geq 0} \} \cup \{ (\tau, e) \in \{\tau\} \times \mathbb{X} \mid f(e) = \varepsilon \lor f(e) = t$,
for some $t \in \mathbb{R}_{\geq 0} \}$, with domains and codomains inherited from $\mathbb{X}$ and
with labelling function given by first projection.


Abstract. All bisimulation problems for pushdown automata are at
least PSPACE-hard. In particular, we show that (1) Weak bisimilarity
of pushdown automata and finite automata is PSPACE-hard, even for
a small fixed finite automaton, (2) Strong bisimilarity of pushdown
automata and finite automata is PSPACE-hard, but polynomial for every
fixed finite automaton, (3) Regularity (finiteness) of pushdown automata
w.r.t. weak and strong bisimilarity is PSPACE-hard.

Keywords: Pushdown automata, bisimulation, verification, complexity 



1 Introduction 

Bisimulation equivalence plays a central role in the theory of process algebras
m-. The decidability and complexity of bisimulation problems for infinite-state
systems has been studied intensively (see m for a survey). While many
algorithms for bisimulation problems have a very high complexity, only few lower
bounds are known. Jančar showed that strong bisimilarity of two Petri nets
m and weak bisimilarity of a Petri net and a finite automaton is undecidable.
Stříbrná PHI showed that weak bisimilarity for Basic Parallel Processes (BPP)
is NP-hard and weak bisimilarity for context-free processes (BPA) is
PSPACE-hard. (BPA are a proper subclass of pushdown automata.) However, it is
still an open question whether these two problems are decidable. So far, the only
known lower bound for a decidable bisimulation problem was an EXPSPACE lower
bound for strong bisimilarity of Petri nets and finite automata HS|, that follows
from the hardness of the Petri net reachability problem HE).

For bisimulation problems where one compares an infinite-state system with a
finite-state one, much more is known about the decidability and complexity than
in the general case of two infinite-state systems PJ. Also the complexity can be
much lower. In particular, weak (and strong) bisimilarity of a BPA-process and
a finite automaton is decidable in polynomial time CZl, while weak bisimilarity
of two BPA-processes is PSPACE-hard j2H].

However, this surprising result does not carry over to general pushdown
automata. We show that strong and weak bisimilarity of a pushdown automaton
and a finite automaton is PSPACE-hard. (These problems were already known
to be in EXPTIME |I1].) For weak bisimilarity this hardness result holds even
for a small fixed finite automaton, while the same problem for strong
bisimilarity is polynomial in the size of the pushdown automaton for every fixed
finite automaton. These results also yield a PSPACE lower bound for strong
bisimilarity of two pushdown automata, a problem that has recently been shown to
be decidable by Sénizergues P7] (the proof in 1211 uses a combination of two
semidecision procedures and does not yield any complexity measure).

J. van Leeuwen et al. (Eds.): IFIP TCS2000, LNCS 1872, pp. 474-^^ 2000.

© Springer-Verlag Berlin Heidelberg 2000

The problem of bisimilarity is also related to the problem of language
equivalence for deterministic systems, e.g., the problem of language equivalence
for deterministic pushdown automata 1^01. See Section 0 for details.

Furthermore, we prove a PSPACE lower bound for the problem of regularity
(finiteness) of pushdown automata w.r.t. weak and strong bisimilarity.

Thus no bisimulation problem for pushdown automata is polynomial (unless
PSPACE = P). This shows that there is a great difference between pushdown
automata and BPA, although they describe exactly the same class of languages
(Chomsky-2).

2 Definitions 



Let $Act = \{a, b, c, \ldots\}$ and $Const = \{\varepsilon, X, Y, Z, \ldots\}$ be disjoint countably
infinite sets of actions and process constants, respectively. The class of general
process expressions $G$ is defined by $E ::= \varepsilon \mid X \mid E \| E \mid E.E$, where
$X \in Const$ and $\varepsilon$ is a special constant that denotes the empty expression.
Intuitively, ‘.’ is a sequential composition and ‘$\|$’ is a parallel composition.
We do not distinguish between expressions related by structural congruence
which is given by the following laws: ‘.’ and ‘$\|$’ are associative, ‘$\|$’ is
commutative, and ‘$\varepsilon$’ is a unit for ‘.’ and ‘$\|$’.

A process rewrite system (PRS) m is specified by a finite set $\Delta$ of rules
which have the form $E \xrightarrow{a} F$, where $E, F \in G$, $E \neq \varepsilon$ and $a \in Act$.
$Const(\Delta)$ and $Act(\Delta)$ denote the sets of process constants and actions which are
used in the rules of $\Delta$, respectively (note that these sets are finite). Each
process rewrite system $\Delta$ defines a unique transition system where states are
process expressions over $Const(\Delta)$. $Act(\Delta)$ is the set of labels. The transitions
are determined by $\Delta$ and the following inference rules (remember that ‘$\|$’ is
commutative):

$\dfrac{(E \xrightarrow{a} F) \in \Delta}{E \xrightarrow{a} F}$    $\dfrac{E \xrightarrow{a} E'}{E.F \xrightarrow{a} E'.F}$    $\dfrac{E \xrightarrow{a} E'}{E \| F \xrightarrow{a} E' \| F}$

We extend the notation $E \xrightarrow{a} F$ to elements of $Act^*$ in a standard way.
Moreover, we say that $F$ is reachable from $E$ if $E \xrightarrow{w} F$ for some $w \in Act^*$.

Various subclasses of process rewrite systems can be obtained by imposing
certain restrictions on the form of rules. To specify those restrictions, we first
define the classes $S$ and $P$ of sequential and parallel expressions, composed of all
process expressions which do not contain the ‘$\|$’ and the ‘.’ operator,
respectively. We also use ‘1’ to denote the set of process constants.

The hierarchy of process rewrite systems is presented in Fig. 1; the restrictions
are specified by a pair $(A, B)$, where $A$ and $B$ are the classes of expressions
which can appear on the left-hand and the right-hand side of rules, respectively.
This hierarchy contains almost all classes of infinite-state systems which have
been studied so far; BPA (Basic Process Algebra, also called context-free
processes), BPP (Basic Parallel Processes), and PA-processes are well-known;
PDA correspond to pushdown automata (as proved by Caucal in p]), PN correspond
to Petri nets, PRS stands for ‘Process Rewrite Systems’, PAD and PAN are
artificial names made by combining existing ones (PAD = PA+PDA,
PAN = PA+PN).

We consider the semantical equivalences weak bisimilarity and strong
bisimilarity m over transition systems generated by PRS. In what follows we
consider process expressions over $Const(\Delta)$ where $\Delta$ is some fixed process
rewrite system.

Definition 1. The action $\tau$ is a special ‘silent’ internal action. The extended
transition relation ‘$\stackrel{a}{\Rightarrow}$’ is defined by $E \stackrel{a}{\Rightarrow} F$ iff either $E = F$ and
$a = \tau$, or $E \xrightarrow{\tau^i} E' \xrightarrow{a} E'' \xrightarrow{\tau^j} F$ for some $i, j \in \mathbb{N}_0$, $E', E'' \in G$.
A binary relation $R$ over process expressions is a weak bisimulation iff whenever
$(E, F) \in R$ then for every $a \in Act$: if $E \xrightarrow{a} E'$ then there is
$F \stackrel{a}{\Rightarrow} F'$ s.t. $(E', F') \in R$ and if $F \xrightarrow{a} F'$ then there is
$E \stackrel{a}{\Rightarrow} E'$ s.t. $(E', F') \in R$. Processes $E, F$ are weakly bisimilar, written
$E \approx F$, iff there is a weak bisimulation relating them. Strong bisimulation is
defined similarly with $\xrightarrow{a}$ instead of $\stackrel{a}{\Rightarrow}$. Processes $E, F$ are strongly
bisimilar, written $E \sim F$, iff there is a strong bisimulation relating them.

Bisimulation equivalence can also be described by bisimulation games between
two players. One player, the ‘attacker’, tries to prove that two given processes
are not bisimilar, while the other player, the ‘defender’, tries to frustrate
this. In every round of the game the attacker chooses one process and performs
an action. The defender must imitate this move and perform the same action in
the other process (possibly together with several internal $\tau$-actions in the case
of weak bisimulation). If one player cannot move then the other player wins. The
defender wins every infinite game. Two processes are bisimilar iff the defender
has a winning strategy and non-bisimilar iff the attacker has a winning strategy.
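
For finite transition systems the game can be decided by a greatest-fixpoint
computation: start from all pairs of states and repeatedly discard every pair
from which the attacker has a move that the defender cannot answer inside the
current relation. The sketch below is an illustration for the finite-state case
only (the systems generated by pushdown automata are in general infinite); the
function name and the sample systems are hypothetical.

```python
def bisimilar_pairs(states, trans):
    """Largest strong bisimulation on a finite system with the given
    transitions (source, label, target)."""
    labels = {a for (_, a, _) in trans}

    def succ(s, a):
        return {t for (p, b, t) in trans if p == s and b == a}

    rel = {(x, y) for x in states for y in states}
    changed = True
    while changed:
        changed = False
        for (x, y) in list(rel):
            defender_ok = all(
                all(any((x2, y2) in rel for y2 in succ(y, a)) for x2 in succ(x, a))
                and
                all(any((x2, y2) in rel for x2 in succ(x, a)) for y2 in succ(y, a))
                for a in labels)
            if not defender_ok:
                # The attacker has an unanswerable move from (x, y).
                rel.discard((x, y))
                changed = True
    return rel

# Classical example: a.(b + c) versus a.b + a.c, placed in one system.
STATES = ["p0", "p1", "p2", "p3", "q0", "q1", "q2", "q3", "q4"]
TRANS = {("p0", "a", "p1"), ("p1", "b", "p2"), ("p1", "c", "p3"),
         ("q0", "a", "q1"), ("q0", "a", "q2"),
         ("q1", "b", "q3"), ("q2", "c", "q4")}
REL = bisimilar_pairs(STATES, TRANS)
assert ("p0", "q0") not in REL  # the attacker wins by moving q0 --a--> q1
assert ("p2", "q3") in REL      # both states are deadlocked
```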

Note that context-free processes (BPA) correspond to the subclass of pushdown
automata (PDA) where the finite control has size 1. Although BPA and
PDA describe the same class of languages (Chomsky-2), BPA is strictly less
expressive w.r.t. bisimulation.

                    PRS (G,G)

          PAD (S,G)           PAN (P,G)

    PDA (S,S)       PA (1,G)        PN (P,P)

          BPA (1,S)           BPP (1,P)

                    FS (1,1)

Fig. 1. A hierarchy of PRS

3 Hardness of Weak Bisimulation Problems

In this section we show lower bounds for problems about weak bisimulation. We 
consider the following two problems: 

Weak bisimilarity of pushdown automata and finite automata 
Instance: A pushdown automaton P and a finite automaton F. 

Question: Pk.F1 

Weak Finiteness of Pushdown Automata 
Instance: A pushdown automaton P. 

Question: Does there exist a finite automaton F s.t. P k F 1 

We show that both these problems are PSPACE-hai'd. The proof is done by 
a reduction from the PS PACE -complete problem if a single tape, linearly space- 
bounded, nondeterministic Turing-machine M accepts a given input w. There is 
a constant k s.t. if M accepts an input w then it has an accepting computation 
that uses only k ■ |w| space. For any such M and w we construct a pushdown 
automaton P s.t. 

— If M accepts w then P is not weakly bisimilar to any finite automaton. 

— If M doesn’t accept w then P is weakly bisimilar to the finite automaton F 
of Figure 0 

The construction of P is as follows: Let $n := k \cdot |w| + 1$ and $\Sigma$ be the set of tape
symbols of M. Configurations of M are encoded as sequences of n symbols of the form $v_1 q v_2$
where $v_1, v_2 \in \Sigma^*$ are sequences of tape symbols of M and q is a state of the finite control
of M. The sequence $v_1$ are the symbols to the left of the head and $v_2$ are the symbols
under the head and to the right of it. ($v_1$ can be empty, but $v_2$ can’t.) Let $p_0$ be the initial
control-state of P and let the stack be initially empty. Initially, P is in the phase ‘guess’ where
it guesses an arbitrarily long sequence $c_1 \# c_2 \# \ldots \# c_m$ of configurations of M (each of these $c_i$ has length n)
and stores them on the stack. The pushdown automaton can guess a sequence of
length n by n times guessing a symbol and storing it on the stack. The number
of symbols guessed (from 1 to n) is counted in the finite control of the pushdown
automaton. The number m is not counted in the finite control, since it
can be arbitrarily large. The configuration $c_m$ at the bottom of the stack must
be accepting (i.e., the state q in $c_m$ must be accepting) and the configuration
$c_1$ at the top must be the initial configuration with the input w and the initial
control-state of M. All this is done with silent $\tau$-actions. At the end of this phase
P is in the control state p. Then there are two possible transitions: (1) $p \xrightarrow{\tau} p_0 A$
where the special symbol $A \notin \Sigma$ is written on the stack and the guessing phase
starts again. (2) $p \xrightarrow{\tau} p_{verify}$ where the pushdown automaton enters the new
phase ‘verify’.

Fig. 2. The finite automaton F with initial state $s_1$.

In the phase ‘verify’ the pushdown automaton P pops symbols from the stack
(by action $\tau$). At any time in this phase it can (but need not) enter the special
phase ‘check’. For a ‘check’ it reads three symbols from the stack. These symbols
are part of some configuration $c_i$. Then it pops n − 2 symbols and then reads
the three symbols at the same position in the next configuration $c_{i+1}$ (unless the
bottom of the stack is reached already). In a correct computation step from $c_i$
to $c_{i+1}$ the second triple of symbols depends on the first and on the definition
of M. If these symbols in the second triple are as they should be in a correct
computation step of M from $c_i$ to $c_{i+1}$ then the ‘check’ is successful and it goes
back into the phase ‘verify’. Otherwise the ‘check’ has failed and P is in the
control-state fail. Here there are two possible transitions: (1) $fail \xrightarrow{\tau} p_2$. In the
control-state $p_2$ the stack is ignored and the pushdown automaton from then
on behaves just like the state $s_2$ in the finite automaton F of Figure 2. (2)
$fail \xrightarrow{\tau} p_3$. In the control-state $p_3$ again the stack is ignored and from then on
the pushdown automaton behaves just like the state $s_3$ in the finite automaton
F of Figure 2. The intuition is that if the sequence of configurations represents
a correct computation of M then no ‘check’ can fail, i.e., the control-state fail
cannot be reached. However, if the sequence isn’t a correct computation then
there must be at least one error somewhere and thus the control-state fail can
be reached by doing the ‘check’ at the right place.
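
The index arithmetic behind a ‘check’ can be made concrete. The sketch below assumes, for illustration only, that the stack is given as a Python string with the top at index 0 and that consecutive configurations of length n are separated by a single `#`; the function name is hypothetical. After reading three symbols and popping n − 2 more, the automaton has moved n + 1 positions, which is exactly the distance to the same position in the next configuration. Whether the second triple is the correct successor of the first depends on the transition function of M, which is not modelled here.

```python
def check_windows(stack, n, pos):
    """Return the two triples that a 'check' started at index `pos` compares.

    stack: string c_1 # c_2 # ... # c_m, top of the stack at index 0
    n:     length of every configuration
    """
    first = stack[pos:pos + 3]        # three symbols read from c_i
    next_pos = pos + 3 + (n - 2)      # pop n - 2 further symbols
    second = stack[next_pos:next_pos + 3]  # same position in c_{i+1}
    return first, second


# Toy configurations of length n = 5 (arbitrary letters, not a real machine).
toy_stack = "abcde#ABCDE#vwxyz"
```

For example, a check started at index 1 of `toy_stack` compares `bcd` with `BCD`, and one started at index 8 compares `CDE` with `xyz`.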

So far, all actions have been silent $\tau$-actions. The only case where a visible
action can occur is the following: The pushdown automaton P is in phase ‘verify’
or ‘check’ (but not in state fail) and reads the special symbol A from the stack.
Then it does the visible action ‘a’ and goes to the control-state $p_{verify}$. If P
reaches the bottom of the stack while being in phase ‘verify’ or ‘check’ then it
is in a deadlock.

Lemma 2. If M accepts the input w then P is not weakly bisimilar to any finite
automaton.

Proof. We assume the contrary and derive a contradiction. Assume that there
is a finite automaton F' with k states s.t. $P \approx F'$. Since M accepts w, there
exists an accepting computation sequence $c = c_1 \# c_2 \# \ldots \# c_m$ where all $c_i$ are
configurations of M, $c_1$ is the initial configuration of M with input w, $c_m$ is
accepting and for all $i \in \{1, \ldots, m-1\}$ $c_i \to c_{i+1}$ is a correct computation step
of M.

P can (by a sequence of $\tau$-steps) reach the configuration $\alpha := p_{verify}(cA)^{k+1}c$.
Since c is an accepting computation sequence of M, none of the checks can fail.
Thus $\alpha$ can only do $\tau$-steps interleaved with exactly k + 1 visible ‘a’ actions, one for
each symbol A on the stack, before it reaches a deadlock.

We assumed that $P \approx F'$. Thus there must be some state f of F' s.t. $\alpha \approx f$.
Since F' has only k states, it follows from the Pumping Lemma for regular
languages that $\alpha \not\approx f$ and we have a contradiction. □



Lemma 3. Let F be the finite automaton from Figure 2. If M doesn’t accept
the input w then $P \approx F$.

Proof. Since there is no accepting computation of M on w, any reachable
configuration of P belongs to one of the following three sets.

1. Let $C_1$ be the set of configurations of P where either P is in phase ‘guess’ or
P is in phase ‘verify’ or ‘check’ s.t. a check can fail before the next symbol
A is popped from the stack, i.e. the control-state fail can be reached with
only $\tau$-actions.

2. Let $C_2$ be the set of configurations of P where either the finite control of P is
in state $p_2$ or P is in phase ‘verify’ or ‘check’, there is at least one symbol A
on the stack and no check can fail before the next symbol A is popped from
the stack, i.e. the control-state fail cannot be reached with only $\tau$-actions,
but possibly after another ‘a’ action.

3. Let $C_3$ be the set of configurations of P where either the finite control of P
is in state $p_3$ or P is in phase ‘verify’ or ‘check’, there is no symbol A on the
stack and no check can fail, i.e. the control-state fail cannot be reached.

The following relation is a weak bisimulation:

$\{(\alpha_1, s_1) \mid \alpha_1 \in C_1\} \cup \{(\alpha_2, s_2) \mid \alpha_2 \in C_2\} \cup \{(\alpha_3, s_3) \mid \alpha_3 \in C_3\}$

We consider all possible attacks.

1. Note that no $\alpha_1 \in C_1$ can do action ‘a’.

- If the attacker makes a move from a configuration in $C_1$ with control-state
fail to $p_2$/$p_3$ then the defender responds by a move $s_1 \xrightarrow{\tau} s_2$/$s_3$.
These are weakly bisimilar to $p_2$/$p_3$ by definition. If the attacker makes
a move $\alpha_1 \xrightarrow{\tau} \alpha_1'$ with $\alpha_1, \alpha_1' \in C_1$ then the defender responds by doing
nothing. If the attacker makes a move $\alpha_1 \xrightarrow{\tau} \alpha_2$ with $\alpha_1 \in C_1$ and
$\alpha_2 \in C_2$ (this is only possible if there is at least one symbol A on the
stack) then the defender responds by making a move $s_1 \xrightarrow{\tau} s_2$. If the
attacker makes a move $\alpha_1 \xrightarrow{\tau} \alpha_3$ with $\alpha_1 \in C_1$ and $\alpha_3 \in C_3$ (this is only
possible if there is no symbol A on the stack) then the defender responds
by making a move $s_1 \xrightarrow{\tau} s_3$.

- If the attacker makes a move $s_1 \xrightarrow{\tau} s_2$/$s_3$ then the defender makes a
sequence of $\tau$-moves where a ‘check’ fails and goes (via the control-state
fail) to a configuration with control-state $p_2$/$p_3$. This is weakly bisimilar
to $s_2$/$s_3$ by definition.

2. If $\alpha_2$ is a configuration with control-state $p_2$ then this is bisimilar to $s_2$ by
definition.

- If the attacker makes a move $\alpha_2 \xrightarrow{\tau} \alpha_2'$ with $\alpha_2, \alpha_2' \in C_2$ then the defender
responds by doing nothing. If the attacker makes a move $\alpha_2 \xrightarrow{a} \alpha_2'$ (this
is only possible if the symbol A is at the top of the stack) then the
control-state of $\alpha_2'$ is $p_{verify}$ and $\alpha_2' \in C_1$. Thus the defender can respond
by $s_2 \xrightarrow{a} s_1$.

- If the attacker makes a move $s_2 \xrightarrow{a} s_1$ then the defender responds as
follows: First he makes a sequence of $\tau$-moves $\alpha_2 \xrightarrow{\tau^*} \alpha_2'$ that pops symbols
from the stack without doing any ‘check’ until the special symbol A is at
the top. Then he makes a move $\alpha_2' \xrightarrow{a} \alpha_2''$. By definition the control-state
of $\alpha_2''$ is $p_{verify}$ and $\alpha_2'' \in C_1$.

3. A configuration $\alpha_3 \in C_3$ can never reach a configuration where it can do
action ‘a’. The only possible action is $\tau$. Thus $\alpha_3 \approx s_3$.

Since the initial configuration of P is in $C_1$ and the initial state of F is $s_1$, we
get $P \approx F$. □



Theorem 4. Weak bisimilarity of pushdown automata and finite automata is
PSPACE-hard, even for the fixed finite automaton F of Figure 2.

Proof. By reduction of the acceptance problem for single tape nondeterministic
linear space-bounded Turing machines. Let M, w, P and F be defined as above. If
M accepts w then by Lemma 2 P is not weakly bisimilar to any finite automaton
and thus $P \not\approx F$. If M doesn’t accept w then by Lemma 3 $P \approx F$. □



Theorem 5. Weak finiteness of pushdown automata is PSPACE-hard.

Proof. By reduction of the acceptance problem for single tape nondeterministic
linear space-bounded Turing machines. Let M, w, P and F be defined as above. If
M accepts w then by Lemma 2 P is not weakly bisimilar to any finite automaton
and thus not weakly finite. If M doesn’t accept w then by Lemma 3 $P \approx F$ and
thus P is weakly finite. □



4 Hardness of Strong Bisimulation Problems 

Strong bisimilarity of pushdown automata and finite automata
Instance: A pushdown automaton P and a finite automaton F.
Question: $P \sim F$?

We show that this problem is PSPACE-hard in general, but polynomial in
the size of P for every fixed finite automaton F. The PSPACE lower bound is
shown by a reduction of the PSPACE-complete problem of quantified boolean
formulae (QBF). Let $n \in \mathbb{N}$ and let $x_1, \ldots, x_n$ be boolean variables. Without restriction we
assume that n is even. A literal is either a variable or the negation of a variable.
A clause is a disjunction of literals. The quantified boolean formula Q is given
by

$Q := \forall x_1 \exists x_2 \ldots \forall x_{n-1} \exists x_n (Q_1 \wedge \ldots \wedge Q_k)$

where the $Q_i$ are clauses. The problem is whether Q is valid. We reduce this problem
to the bisimulation problem by constructing a pushdown automaton P and a
finite automaton F s.t. Q is valid iff $P \sim F$.
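
To make the source problem of the reduction concrete, the following sketch evaluates a quantified boolean formula of the shape used here, with universal quantifiers on odd-indexed variables and existential quantifiers on even-indexed ones. The encoding of clauses as lists of signed integers and the function name are assumptions chosen for illustration; the brute-force recursion takes exponential time and is not part of the reduction itself.

```python
def qbf_valid(n, clauses, assignment=()):
    """Decide validity of  forall x1 exists x2 ... (Q_1 and ... and Q_k).

    clauses: list of clauses; a clause is a list of non-zero integers,
             +i standing for x_i and -i for the negation of x_i.
    assignment: tuple of booleans for x_1 .. x_len(assignment).
    """
    i = len(assignment) + 1
    if i > n:
        # All variables are assigned: every clause needs one true literal.
        return all(
            any((lit > 0) == assignment[abs(lit) - 1] for lit in clause)
            for clause in clauses
        )
    branches = (qbf_valid(n, clauses, assignment + (v,)) for v in (False, True))
    # Odd indices are universally quantified, even indices existentially.
    return all(branches) if i % 2 == 1 else any(branches)
```

For instance, the formula with clauses `[[1, 2], [-1, -2]]` over two variables is valid (choose $x_2$ as the negation of $x_1$), whereas the single clause `[[1]]` is not, because $x_1$ is universally quantified.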

Note that, unlike in the previous section, the size of F is not fixed, but linear in
n. Figure 3 illustrates the construction.

Now we define the pushdown automaton P. Initially the stack is empty and
the initial control-state is $p_0$. For $1 \le j \le k$ and $1 \le l \le n$ we define $Q_j(X_l)$ iff
$X_l$ makes the clause $Q_j$ true and $Q_j(\bar{X}_l)$ iff $\bar{X}_l$ makes $Q_j$ true. The transitions
of P are as follows:

$p_{2i} \xrightarrow{x_{2i+1}} p_{2(i+1)} X_{2i+2} X_{2i+1}$ for $0 \le i \le n/2 - 1$

$p_{2i} \xrightarrow{x_{2i+1}} p_{2(i+1)} \bar{X}_{2i+2} X_{2i+1}$ for $0 \le i \le n/2 - 1$

$p_{2i} \xrightarrow{\bar{x}_{2i+1}} p_{2(i+1)} X_{2i+2} \bar{X}_{2i+1}$ for $0 \le i \le n/2 - 1$

$p_{2i} \xrightarrow{\bar{x}_{2i+1}} p_{2(i+1)} \bar{X}_{2i+2} \bar{X}_{2i+1}$ for $0 \le i \le n/2 - 1$

$p_{2i} \xrightarrow{x_{2i+1}} r_{2(i+1)}$ for $0 \le i \le n/2 - 1$

$p_{2i} \xrightarrow{\bar{x}_{2i+1}} r_{2(i+1)}$ for $0 \le i \le n/2 - 1$

$p_n \xrightarrow{c} q_j$ for $0 \le j \le k$

$q_0 \xrightarrow{c} q_0$

$q_j X_l \xrightarrow{c} q_j X_l$ for $1 \le j \le k$, $1 \le l \le n$ if $Q_j(X_l)$.

$q_j X_l \xrightarrow{c} q_j$ for $1 \le j \le k$, $1 \le l \le n$ if $\neg Q_j(X_l)$.

$q_j \bar{X}_l \xrightarrow{c} q_j \bar{X}_l$ for $1 \le j \le k$, $1 \le l \le n$ if $Q_j(\bar{X}_l)$.

$q_j \bar{X}_l \xrightarrow{c} q_j$ for $1 \le j \le k$, $1 \le l \le n$ if $\neg Q_j(\bar{X}_l)$.



Additionally we define for $1 \le i \le n/2 - 1$ that in the control-state $r_{2i}$ the stack
is ignored and the system behaves just like $t_{2i}$ in the system F of Figure 3.


Fig. 3. Reducing QBF to strong bisimulation.

Lemma 6. If Q is not valid then $P \not\sim F$.

Proof. If Q is not valid then $\exists x_1 \forall x_2 \ldots \exists x_{n-1} \forall x_n (\neg Q_1 \vee \ldots \vee \neg Q_k)$ and the
attacker has the following winning strategy: The attacker chooses the values for
the variables with the odd indices by doing actions $x_i$ or $\bar{x}_i$ in the finite automaton
F and goes from $s_0$ to $s_n$. The defender can respond in two different ways:
(1) If the defender goes into a control-state $r_{2i}$ for some i then the attacker can
easily win, since $r_{2i}$ behaves like $t_{2i}$ and $s_{2i} \not\sim t_{2i}$ for every i. (2) If the defender
stays in the ‘p-domain’ of control-states, he is forced to store the attacker’s
choices for the variables with odd indices on the stack. However, he can make
his own choices for the variables with even indices and also stores them on the
stack. Finally, the defender reaches the control-state $p_n$ and the stack contains
an assignment of values to all n variables. Since Q is not valid, there exists at
least one $Q_j$ with $1 \le j \le k$ that is not satisfied by this assignment. Now the
attacker changes sides and makes the move $p_n \xrightarrow{c} q_j$ in the pushdown automaton
P. The defender can only respond by making the move $s_n \xrightarrow{c} u$ in the system
F. Now the pushdown automaton P can do the action ‘c’ only n times, while
system F in state u can do it infinitely often. Thus the attacker can win. It
follows that $P \not\sim F$. □



Lemma 7. If Q is valid then $P \sim F$.

Proof. Let C be a content of the stack and thus a (possibly incomplete) assignment
of values to variables. Let $Q_i(C)$ be true iff C makes clause $Q_i$ true. Let
$Q(C) := \bigwedge_{1 \le i \le k} Q_i(C)$. Let $QX(C)$ be true iff C can be completed to a $C'$ s.t.
$Q(C')$. If Q is valid then the following relation is a strong bisimulation.

$\{(p_{2i}C, s_{2i}) \mid 0 \le i \le n/2 \wedge QX(C)\} \cup \{(p_{2i}C, t_{2i}) \mid 1 \le i \le n/2 \wedge \neg QX(C)\} \cup$
$\{(r_{2i}C, t_{2i}) \mid 1 \le i \le n/2\} \cup \{(q_jC, u) \mid 1 \le j \le k \wedge Q_j(C)\} \cup \{(q_0C, u)\} \cup$
$\{(q_jC, w_i) \mid 1 \le j \le k \wedge 0 \le i \le n \wedge \neg Q_j(C) \wedge length(C) = i\}$

Since $(p_0\varepsilon, s_0)$ is in this relation, we get $P \sim F$. □



Theorem 8. Strong bisimilarity of pushdown automata and finite automata is
PSPACE-hard.

Proof. Directly from Lemma 6 and Lemma 7. □



Corollary 9. Strong bisimilarity of pushdown automata is PSPACE-hard.

Note that Theorem 4 is not a corollary of Theorem 8. For weak bisimilarity
the hardness result holds even for the small fixed finite automaton of Figure 2.
However, strong bisimilarity of a pushdown automaton P and a finite automaton
F is polynomial in the size of P for every fixed F.

Theorem 10. Let F be a fixed finite automaton. For every pushdown automaton
P the problem whether $P \sim F$ requires only polynomial time in the size of P.

Proof. Using the construction from [?] one can reduce the problem $P \sim F$ to
a model checking problem in the temporal logic EF (a fragment of CTL). One
can effectively construct Hennessy-Milner Logic formulae $\Phi$ and $\Psi$ that depend
only on F s.t.

$P \sim F \iff (P \models \Phi) \wedge (P \models \neg EF\, \Psi)$

where the modal operator EF denotes reachability. Let n be the size of (the
description of) P and m the maximum of the nesting-depth of $\Phi$ and $\Psi$. (The
total size of $\Phi$ and $\Psi$ can be $O(2^m)$.) Let P' be a state that is reachable from
P. Whether P and P' satisfy $\Phi$ and $\Psi$, respectively, depends only on the control
state and on the first m stack symbols of P and P'. There are only n
different possibilities for the control state and $n^m$ different possibilities for the
first m stack symbols. For each of these configurations we check whether it
satisfies $\Phi$ or $\Psi$. Also for each
$\alpha$ of these configurations we check whether P can reach a configuration $\alpha\beta$ for
some $\beta$. ($\beta$ represents the stack contents below the first m stack symbols. It does
not matter for $\Phi$ and $\Psi$.) Each of those checks and (generalized) reachability-checks
can be done in time polynomial in n for fixed m [?]. Therefore the whole property above
can be checked in time that is polynomial in n, the size of P, but
exponential in m. (To be precise, m depends only on F and can be made linear
in the number of states in F [?].) □

Now we consider the strong finiteness problem. 

Strong Finiteness of Pushdown Automata
Instance: A pushdown automaton P.

Question: Does there exist a finite automaton F s.t. $P \sim F$?

We show that this problem is PSPACE-hard by a reduction of QBF. Let Q,
P and F be defined just as before in the hardness proof of strong bisimilarity.
As shown before, Q is valid iff $P \sim F$. We now construct a pushdown automaton
P' s.t. P' is finite w.r.t. strong bisimilarity iff $P \sim F$. The initial configuration
of P' is $p'Z$. The transition rules are

Note that if P' is in control-state $p_0$ or $s_0$ then it behaves like P and F,
respectively.

Lemma 11. If $P \not\sim F$ then P' is infinite w.r.t. strong bisimilarity.

Proof. There are infinitely many non-bisimilar reachable states $q'C^iZ$ for all
$i \in \mathbb{N}$. It suffices to show that $q'C^iZ \not\sim q'C^jZ$ for $i > j$. The attacker has
the following winning strategy: He does action b' exactly j times (the defender
can respond in only one way) and the new state in the bisimulation game is
$(q'C^{i-j}Z, q'Z)$. Then the attacker does action c' and after the defender’s response
the new state is $(p_0C^{i-j-1}Z, s_0)$. Since $P \not\sim F$, the attacker can win. □



Lemma 12. If $P \sim F$ then P' is finite w.r.t. strong bisimilarity.

Proof. Let the finite automaton F' with initial state s' be defined by 

s' s' 
s' ^t' 
t' ^t' 

$t' \xrightarrow{c'} s_0$

where $s_0$ is the initial state of F. If $P \sim F$ then $p'C^iZ \sim s'$, $q'C^iZ \sim t'$,
$p_0C^iZ \sim s_0$ and $s_0 \sim s_0$ and thus $P' \sim F'$. □



Theorem 13. Strong finiteness of pushdown automata is PSPACE-hard.

Proof. It follows from Lemmas 6, 7, 11 and 12 that Q is satisfiable iff $P \sim F$ iff
P' is finite w.r.t. strong bisimilarity. □

It might seem that Theorem 5 is a corollary of Theorem 13. However, a
careful inspection reveals a slight difference. The proof of Theorem 5 shows that
the question, given a pushdown automaton P, “Is P weakly bisimilar to any
finite automaton with at most 3 states?” is PSPACE-hard. The same question
for strong bisimilarity is polynomial, because of Theorem 10. (These results still
hold if the number 3 in the question above is replaced by any other integer $k \ge 3$.
For weak bisimilarity the question is PSPACE-hard in the size of P. For strong
bisimilarity it is polynomial in the size of P and exponential in k.) So, while
in general the finiteness problem for a pushdown automaton P is PSPACE-hard
for both weak and strong bisimilarity, the modified question “Is P finite
and small?” is PSPACE-hard for weak bisimilarity, but polynomial for strong
bisimilarity. To conclude, finiteness w.r.t. weak bisimilarity is hard in a slightly
stronger sense.



On the Complexity of Bisimulation Problems for Pushdown Automata 485 



5 Conclusion 

We have shown that all bisimulation problems for pushdown automata are at
least PSPACE-hard. Thus no bisimulation problem for pushdown automata is
polynomial (unless PSPACE = P). It is interesting to compare these results
with the results for context-free processes (BPA), which describe exactly the
same class of languages (Chomsky-2). Strong and weak bisimilarity of BPA and
finite automata can be decided in polynomial time [?]. This shows that there is
a significant difference between pushdown automata and context-free processes
(BPA) as far as ‘branching-time equivalences’ like strong and weak bisimulation
are concerned. Intuitively, the reason for this is that, due to their finite control,
pushdown automata have a limited power of self-test that context-free processes
lack.

The problem of bisimulation equivalence is related to the problem of language
equivalence for deterministic systems, e.g., the problem of language equivalence
for deterministic pushdown automata (dPDA), which has been shown to be
decidable in [?]. However, the relationship is more complex than it seems, because
of the presence of $\varepsilon$-transitions in PDAs. ‘Real-time’ PDAs are PDAs without
$\varepsilon$-transitions. We denote them by rPDA. We denote real-time deterministic PDAs
as rdPDA. We can distinguish five problems.

1. For rdPDA, strong bisimilarity and trace-language equivalence coincide. (The
problem of trace-language equivalence can easily be reduced to terminal-language
equivalence on rdPDA.) This problem is also equivalent to strong
bisimilarity of dPDA, because the $\varepsilon$-transitions don’t matter for strong
bisimilarity. Language equivalence on rdPDA has been shown to be decidable
in [?]. Neither an upper complexity bound nor a lower complexity bound is
known.

2. Strong bisimilarity for PDA and rPDA. These problems are equivalent,
because the $\varepsilon$-transitions don’t matter for strong bisimilarity. Decidability of
strong bisimilarity for PDA has been shown in [?]. No upper complexity
bound is known. Theorem 8 gives a PSPACE lower bound.

3. Language equivalence of dPDA. This is equivalent to weak bisimilarity of
dPDA, if one renames the $\varepsilon$-transitions to $\tau$-transitions. The problem is
decidable by [?]. Neither an upper complexity bound nor a lower complexity
bound is known.

4. Weak bisimilarity for PDA. It is an open question whether this problem is decidable.
A PSPACE lower bound has been shown in [?] (even for BPA). Theorem 4
shows that even the asymmetric problem of weak bisimilarity of a PDA and
a (small fixed) finite automaton is PSPACE-hard.

5. Language equivalence for PDA and rPDA. These problems are inter-reducible
and undecidable by [?].

Figure 4 shows the relationships between these five problems. The hardness
results of this paper hold only for bisimilarity of nondeterministic PDA (i.e.,
problems number 2 and 4) and thus they don’t yield a lower bound for the
problem of language equivalence of dPDA (problem number 3). In particular,
it is easy to see that language equivalence of a dPDA and a deterministic
finite automaton is polynomial (unlike bisimilarity for nondeterministic systems;
see Theorem 8). It still cannot be ruled out that a polynomial algorithm for
language equivalence of dPDA might exist.

Two lower bounds for bisimulation problems about Petri nets have not been
mentioned explicitly in the literature so far. They concern the problems
of strong bisimilarity of a Petri net and a finite automaton and finiteness of a
Petri net w.r.t. strong bisimulation. It can easily be shown that these problems
are EXPSPACE-hard by a reduction of the
problem whether a given place in a Petri net can ever become marked. (This problem is
polynomially equivalent to the reachability problem for Petri nets [?] and thus
EXPSPACE-hard [?].)

Table 1 summarizes known results about the complexity of bisimulation problems
for several classes of infinite-state systems. The different columns show the
results about the following problems: strong bisimilarity with finite automata,
strong bisimilarity of two infinite-state systems, weak bisimilarity with finite
automata and weak bisimilarity of two infinite-state systems. New results are in
boldface.

Fig. 4. Bisimulation vs. languages

Table 2 summarizes results about the problems of strong and weak finiteness.
New results are in boldface.

Some more results are known about the restricted subclasses of these systems
that satisfy the ‘normedness condition’ (e.g. [?]). Normedness means that
from every reachable state there is a terminating computation. This condition
makes many bisimulation problems much easier, e.g., strong bisimilarity of normed
BPP is decidable in polynomial time [?], while it is at least co-NP-hard in
the general case [?]. Also for normed systems finiteness w.r.t. strong bisimilarity
coincides with boundedness [?], while this doesn’t hold in the general case.



Acknowledgment: Thanks to Colin Stirling for helpful discussions. 

Abstract. Partial continuations are control operators in functional programming
such that a function-like object is abstracted from a part of
the rest of computation, rather than the whole rest of computation.
Several different formulations of partial continuations have been proposed
by Felleisen, Danvy & Filinski, Hieb et al., and others, but as far as we
know, no one has studied a logic for partial continuations, nor proposed
a typed calculus of partial continuations which corresponds to a logical
system through the Curry-Howard isomorphism. This paper gives
a simple type-theoretic formulation of a form of partial continuations
(which we call delimited continuations), and studies its properties. Our
calculus reflects the intended operational semantics, and enjoys nice
properties such as subject reduction and confluence. By restricting the
type of delimiters to be atomic, we obtain the normal form property. We
also show a few examples.



1 Introduction 

The mechanism of first-class continuations (the call/cc mechanism in Scheme
[?]) is a quite powerful control facility, and is available in many modern
programming languages such as Standard ML. Felleisen et al. [?] established a theory
for first-class continuations by which we can reason about properties of programs
with first-class continuations.

Partial continuation is a refinement of first-class continuation in that a
continuation object is abstracted from a part of the rest of computation, rather than
the whole rest of computation. Felleisen [?] introduced a pair of operators # and
$\mathcal{F}$ to represent partial continuations. The former delimits the range of
continuations which will later be invoked by the latter operator. The other distinguishing
feature of partial continuations is that their invocation does not abort the current
continuation, contrary to first-class continuations. Hence the abstracted
objects are normal functions which can be composed with other functions. As
Felleisen showed, the concept of partial continuation is useful in practice; interesting
examples can be implemented more concisely and efficiently using partial
continuations. Since then, several different operators for partial continuations have
been proposed by Queinnec and Serpette [?], Gunter et al. [?], Danvy and Filinski
[?] and others.

If we want to give a logical view to partial continuations through the
Curry-Howard isomorphism, a fundamental problem arises in these approaches.
Namely, the scope of Felleisen’s # operator (and its counterpart in other
researchers’ calculi) is dynamic; for each $\mathcal{F}$ operator, the matching # is determined
at the time of evaluation. Consequently, we cannot represent the scope of #
by a simple variable-binding mechanism, and thus cannot construct a simple logic
corresponding to their operators. One exception is the subcontinuation by Hieb
et al. [?] whose operator has static scope. However, their operator may generate
run-time errors and we cannot develop a type safe calculus for subcontinuations.

In this paper, we give a simple typed calculus for partial continuations which
have static scope. Since the scope of a partial continuation is lexically determined
by the corresponding delimiter, our variant is called a delimited continuation¹.
Our calculus is designed to satisfy the following conditions: (1) it is a statically
typed system which corresponds to, through the Curry-Howard isomorphism, a
consistent logical system, (2) it is type safe in the sense that it enjoys the subject
reduction property and reductions never get stuck, and (3) its reduction rules
are confluent, and compatible with any contexts. By (1), our calculus can be
viewed as a logical system. Indeed, it corresponds to classical logic. By (2) and
(3), we have an equational theory for programs with partial continuations.

In order to make our type-theoretic analysis easier, we represent (the
counterpart of) $\mathcal{F}$ by two operators, an invoker of partial continuations and a throw
operation, and give reduction rules. Our reduction rules reflect the intended
operational semantics, and enjoy the above properties (1)-(3). We also show that the
subcalculus with the delimiter and the throw operation (without the invoker)
corresponds to the classical catch/throw calculus in [?], while the subcalculus
with the delimiter and the invoker (without the throw operation) corresponds
to intuitionistic calculus.

The rest of this paper is organized as follows. Section 2 gives the background
and motivation of our formulation. Sections 3, 4 and 5 give the type system,
the operational semantics, and the reduction rules of our calculus, respectively.
Section 6 proves theoretical properties such as subject reduction and confluence.
Section 7 shows that our calculus corresponds to classical logic through the
Curry-Howard isomorphism. Section 8 gives the conclusion.

2 Formulating Partial Continuations in Type Theory 

We start our analysis with Felleisen’s formulation. Felleisen [?] proposed a form
of partial continuations, which has the following reduction rule:

$\#E[\mathcal{F}V] \to V(\lambda x.E[x])$

where E is an evaluation context, V is a value, # is a delimiter which restricts
the range of partial continuations, and $\mathcal{F}$ is an operator which creates a
partial-continuation object up to the delimiter. In the above term, the created partial
continuation object $\lambda x.E[x]$ is applied to the term V. Note that this object
is a simple function, and not abortive in the sense that, when some value is
applied, it does not throw away the current continuation. These two features
are characteristic points for partial continuations compared with first-class (full)
continuations.

¹ Olivier Danvy coined this term.
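
The reduction rule for # and $\mathcal{F}$ can be mimicked with ordinary functions. The following is only an illustrative model under strong assumptions: the evaluation context E and the value V are represented as Python functions, evaluation inside the context is pure, and the name `felleisen_step` is hypothetical. It is not an implementation of Felleisen's calculus.

```python
def felleisen_step(E, V):
    """Model of one step  #E[F V]  ->  V(lambda x. E[x]).

    E: the evaluation context up to the delimiter, as a one-argument function
    V: a value that expects the captured partial continuation
    """
    captured = lambda x: E(x)  # the partial continuation: an ordinary function
    return V(captured)


# Context E = 1 + [ ], and a value that uses the continuation twice.
add_one = lambda hole: 1 + hole
use_twice = lambda k: k(k(10))
ignore = lambda k: 0
```

Because the captured object is an ordinary, non-abortive function, it composes with itself: `felleisen_step(add_one, use_twice)` evaluates to 12, while a value that ignores the continuation, such as `ignore`, simply discards the context and yields 0.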

Felleisen’s partial continuations are a refinement of full continuations;
theoretically partial continuations behave well, and are useful in practice. However, if we
want to construct a typed calculus for partial continuations which corresponds
to a sound logical system, it is problematic.

The problem is that the scope of the #-operator is dynamic. For instance,
$(\lambda x.\#(xN))(\lambda y.\mathcal{F}M)$ reduces to $\#((\lambda y.\mathcal{F}M)N)$, thus $\mathcal{F}$ gets captured by #
through this reduction. In this system, we cannot intuitively understand the
meaning of programs. Consider the term $\lambda x.\#(x(\mathcal{F}M))$. We are tempted to
consider that this # and $\mathcal{F}$ are connected. But if we apply $\lambda y.C[\#y]$ to it, the above
$\mathcal{F}$ is delimited by the latter #. It seems impossible to have a Curry-Howard
isomorphism between this kind of calculi and ordinary logical systems, since the
correspondence between the #-operator and the $\mathcal{F}$-operator cannot be represented
by the variable-binding mechanism (which is lexical).

Danvy and Filinski proposed another formulation of partial continuations
[?]. Their operators reset and shift differ from Felleisen’s ones in that the
created partial continuation object is again delimited. This change has a better
effect for formalizing partial continuations, and in fact, they successfully gave
a CPS-translation in an ordinary, functional style. However, the scope of
their reset operator (denoted by # in the above reduction) is still dynamic,
hence the same problem applies if one wants to formalize their operator in a type
theory which admits the Curry-Howard isomorphism.

Hieb, Dybvig and Anderson [?] proposed subcontinuations which essentially
have the following reduction rule:

$\#_l E[\mathcal{F}_l V] \to V(\lambda x.\#_l(E[x]))$

where $l$ is a label attached to the operators. The notable point in their approach
is that (i) they can treat multiple labels so that the operator $\mathcal{F}$ can specify
the matching delimiter, and (ii) the binding mechanism of the label $l$ is the
ordinary variable binding so that it is static. These two points are big benefits
from the logical viewpoint. However, the $\mathcal{F}$-operator may become unbound through
the reduction, causing a run-time type-error (when V contains an occurrence of
the label $l$, then the reduced term may contain $l$ free). Also they did not study
a typed calculus for their operator.

Our aim is to develop a theory for partial continuations which is logically
well-founded. Since existing partial continuation operators are not satisfactory
in the sense of logic, we shall change the operational behavior of existing partial
continuations to obtain a logically well-behaved system. We should be careful with
this change of semantics so that interesting programming examples with existing
partial continuations remain expressible in our calculus. In particular, we should
try to make this change as small as possible.



492 Y. Kameyama 



As a conclusion, we decided to formalize an improved version of Hieb et al.'s
operator so that no run-time type error may occur. In order to make our analysis
easier, we shall not directly formalize the $\mathcal{F}$-operator; instead we treat two
operators, callpc (which stands for "call with partial continuation") and throw, which
essentially have the following reductions:

$$\#_\alpha E[\mathsf{callpc}_\alpha V] \to \#_\alpha E[V(\lambda x.\#_\alpha E[x])]$$

$$\#_\alpha E[\mathsf{throw}_\alpha V] \to \#_\alpha V$$

where $V$ is a value and $E$ is an evaluation context defined later. The point is
that, in the first rule, we attach one more delimiter to enclose the resulting term.
As is shown later, our (counterpart of the) $\mathcal{F}$-operator can be represented by
callpc and throw.



3 The Type System 

We now define the type system of our calculus. Actually we define two
calculi, $\lambda_{pc}^{atomic}$ and $\lambda_{pc}$. $\lambda_{pc}$ is the full calculus, and by putting a restriction on
terms we obtain $\lambda_{pc}^{atomic}$.

Types and terms are defined by the following grammar, where $K$ is an atomic
type, $c$ is a constant of atomic type $K$, and $x$ and $\alpha$ are variables.¹ We assume
that Unit is an atomic type, and $\bullet$ is a constant (its single element) of type
Unit.



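$$
\begin{array}{rcl}
A, B & ::= & K \mid A \to B \mid \neg A \\
M, N & ::= & x \mid c \mid \lambda x.M \mid MN \\
     & \mid & \#_\alpha M \mid \mathsf{callpc}_\alpha M \mid \mathsf{throw}_\alpha M
\end{array}
$$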

The type $\neg A$ is the type of tags of control operators for type $A$. Note that $\neg$ is
a primitive type constructor, and not a defined symbol. In $\lambda_{pc}^{atomic}$, we require
the types of tags to be atomic; namely, in forming $\neg A$, the type $A$ must
be atomic. $\lambda_{pc}$ has no such restriction.

The first line of terms consists of the usual $\lambda$-terms. The second line consists of control
operators. The term $\#_\alpha M$ delimits the scope of partial continuations which may
be created inside $M$ (with the tag $\alpha$). The term $\mathsf{callpc}_\alpha M$ creates a
partial-continuation (delimited continuation) object like the $\mathcal{F}$-operator. Our calculi also
have an abortive operator $\mathsf{throw}_\alpha M$ which finishes the current continuation up
to the corresponding delimiter. As a notational convention, $\#_\alpha M_1 \ldots M_n$ should
be parsed as $\#_\alpha(M_1 \ldots M_n)$.

A type environment is a finite set of declarations of the form $x : A$ in which no variable appears
more than once. For instance, $\{x : A, \alpha : \neg(B \to C)\}$ is a type environment.
We use $\Gamma$ for a type environment. A judgement is of the form $\Gamma \vdash M : A$. The
typing rules to derive a judgement are displayed in Table 1. As usual, if two
type environments $\Gamma_1$ and $\Gamma_2$ are not compatible, then $\Gamma_1 \cup \Gamma_2$ is not defined. If
$\Gamma \vdash M : A$ is derived, we say $M$ is a (typable) term of type $A$ under the type
environment $\Gamma$. We sometimes omit type environments if they are apparent.

¹ There is no syntactic distinction between $x$ and $\alpha$, but we use $x$ for ordinary variables and
$\alpha$ for tags.



Table 1. Typing Rules 



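$$\Gamma \cup \{x : A\} \vdash x : A \qquad\qquad \Gamma \vdash c : K$$

$$\frac{\Gamma \cup \{x : A\} \vdash M : B}{\Gamma \vdash \lambda x.M : A \to B} \qquad\qquad \frac{\Gamma_1 \vdash M : A \to B \quad \Gamma_2 \vdash N : A}{\Gamma_1 \cup \Gamma_2 \vdash MN : B}$$

$$\frac{\Gamma \cup \{\alpha : \neg B\} \vdash M : B}{\Gamma \vdash \#_\alpha M : B}$$

$$\frac{\Gamma \vdash M : ((\mathrm{Unit} \to A) \to B) \to A}{\Gamma \cup \{\alpha : \neg B\} \vdash \mathsf{callpc}_\alpha M : A} \qquad\qquad \frac{\Gamma \vdash M : B}{\Gamma \cup \{\alpha : \neg B\} \vdash \mathsf{throw}_\alpha M : C}$$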



In Table 1, the first three rules are as usual. The fourth rule is for the control
delimiter and, as we explained, it discharges the assumption $\alpha : \neg B$; namely, it
binds the variable $\alpha$.

The fifth and sixth rules are for callpc and throw. The role of callpc is
almost the same as that of $\mathcal{F}$, but its type is slightly different, since (i) we split the role
of $\mathcal{F}$ into the two operators callpc and throw, and (ii) to control the evaluation
order suitably, we use a thunk of type $\mathrm{Unit} \to A$ in the rule for callpc. We can
delay the evaluation of a term $M$ of type $A$ by encapsulating it as $\lambda z.M$, where $z$ is a
fresh variable of type Unit, and force it later by applying the thunk to $\bullet$.
This technique is useful when we argue the correspondence between the ML-like
operational semantics and this formulation in Section 6. Since $\mathsf{throw}_\alpha M$ aborts
the current continuation and jumps to the corresponding control delimiter, its
type can be any type. The variable $\alpha$ in these two operators is a free occurrence.

The variables $x$ and $\alpha$ are bound by $\lambda x.M$ and $\#_\alpha M$, respectively, and
$FV(M)$ denotes the set of free variables in $M$. As usual, we identify two terms
which differ in bound variables only. The term $M[N/x]$ denotes the result of
substituting $N$ for $x$ in $M$. We also say that a term $M$ is closed if $FV(M) = \{\}$.

Note that $\alpha$ is an ordinary variable, so we may abstract $\alpha$ as in $\lambda\alpha.M$. This
has a practical benefit, since we often want to define functions with free tags and
bind them in another function. Let us show an example in a Scheme-like language.
We often want to write the following program:

(define (foo x)
  (callpc alpha ...) ...)

(define (goo y)
  (catch alpha (foo y)))

In our calculus, such a program is not allowed (since our control operators have 
static scope). However, we can represent the above program by abstracting the 
variable alpha in foo. The resulting program is: 

(define (foo x beta)
  (callpc beta ...) ...)

(define (goo y)
  (catch alpha (foo y alpha)))

By this technique, we can (partly) recover the expressiveness which was lost by
changing the scope of delimiters from dynamic to static. However, not
all programs expressible with dynamic-scope operators can be expressed in our
calculus. We think that this is a trade-off between theory and practice.
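The tag-passing idiom above can be imitated in Python, as a rough sketch only. Python has no delimited continuations, so this sketch covers only the catch/throw fragment: `catch` plays the role of the delimiter and creates a fresh tag, and `throw` aborts to the matching `catch`. The names `Throw`, `catch`, `throw`, and the bodies of `foo` and `goo` are assumptions made for illustration; they are not taken from the paper.

```python
class Throw(Exception):
    """Carries a value to the catch that owns `tag` (hypothetical helper)."""
    def __init__(self, tag, value):
        super().__init__(tag, value)
        self.tag = tag
        self.value = value

def catch(body):
    """Run body(tag) with a fresh, statically scoped tag."""
    tag = object()                 # fresh tag, like binding alpha with #
    try:
        return body(tag)
    except Throw as t:
        if t.tag is tag:           # only the matching delimiter handles it
            return t.value
        raise                      # some other delimiter's tag: pass it on

def throw(tag, value):
    raise Throw(tag, value)

def foo(x, beta):
    # foo does not know which delimiter it runs under; it receives the tag.
    if x < 0:
        throw(beta, "negative")
    return 2 * x

def goo(y):
    # goo binds the tag and hands it to foo, as in (catch alpha (foo y alpha)).
    return catch(lambda alpha: foo(y, alpha))

print(goo(3))    # 6
print(goo(-1))   # negative
```

Because the tag is an ordinary argument, `foo` can be reused under any delimiter, which is the point of abstracting over tags.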

4 ML-Like Operational Semantics 

In this section we give an operational semantics of our calculi in the style of
Felleisen et al. Note that this operational semantics may cause run-time type
errors; therefore its direct formalization (in a type-safe way) is not possible. We
nevertheless state the operational semantics here to clarify our intended
semantics. Later we show that the equality in $\lambda_{pc}^{atomic}$ corresponds to this operational
semantics.

We first define values (denoted by $V$), redexes (denoted by $R$), and evaluation
contexts (denoted by $E$) as follows:



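$$
\begin{array}{rcl}
V & ::= & x \mid c \mid \lambda x.M \\
R & ::= & (\lambda x.M)V \mid \#_\alpha V \mid \mathsf{callpc}_\alpha V \mid \mathsf{throw}_\alpha V \\
E & ::= & [\,] \mid EM \mid VE \mid \mathsf{callpc}_\alpha E \mid \mathsf{throw}_\alpha E \mid \#_\alpha E
\end{array}
$$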

Then any closed term $M$ is either a value, or there exists a unique
pair of a redex $R$ and an evaluation context $E$ such that $M = E[R]$.

We have the following 1-step reduction rules, where $E_0$ is an evaluation
context which does not bind $\alpha$.



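$$
\begin{array}{rcl}
E[(\lambda x.M)V] & \leadsto_1 & E[M[V/x]] \\
E[\#_\alpha V] & \leadsto_1 & E[V] \\
E[\#_\alpha E_0[\mathsf{callpc}_\alpha V]] & \leadsto_1 & E[\#_\alpha E_0[V(\lambda u.\#_\alpha E_0[u\bullet])]] \\
E[\#_\alpha E_0[\mathsf{throw}_\alpha V]] & \leadsto_1 & E[V] \\
E_0[\mathsf{callpc}_\alpha V] & \leadsto_1 & \mathrm{error} \\
E_0[\mathsf{throw}_\alpha V] & \leadsto_1 & \mathrm{error}
\end{array}
$$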

Since the decomposition of a term is unique, the above set of reduction rules 
induces a deterministic evaluation strategy. 

Run-time errors may happen in $\lambda_{pc}$ even if a reduction begins with a closed
term; in the second and the fourth rules, the value $V$ may be $\lambda x.M$ and $M$ may
contain free occurrences of $\alpha$. For instance:

$$(\#_\alpha \lambda x.\mathsf{callpc}_\alpha x)V \leadsto_1 (\lambda x.\mathsf{callpc}_\alpha x)V \leadsto_1 \mathsf{callpc}_\alpha V \leadsto_1 \mathrm{error}$$

On the contrary, no run-time errors occur in $\lambda_{pc}^{atomic}$, since the value $V$ in $\#_\alpha V$
and $\mathsf{throw}_\alpha V$ must be either a variable or a constant.
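The six rules above can be prototyped as a small substitution-based evaluator. The following Python code is a minimal sketch under stated assumptions, not the author's implementation: terms are encoded as tuples (`'var'`, `'const'`, `'lam'`, `'app'`, `'reset'` for the delimiter, `'callpc'`, `'throw'`), an evaluation context is a list of frames, the `error` outcome is modelled by `RuntimeError`, and a delta rule for primitive constants such as the curried `ADD` is added purely so that arithmetic examples can run.

```python
import itertools

_counter = itertools.count()

def fresh(base):
    """Return a name that cannot clash with user-written names."""
    return f"{base}%{next(_counter)}"

UNIT = ('const', ())                      # the constant written as a bullet

def is_value(t):
    return t[0] in ('var', 'const', 'lam')

def fv(t):
    kind = t[0]
    if kind == 'var':
        return {t[1]}
    if kind == 'const':
        return set()
    if kind in ('lam', 'reset'):          # both forms bind the name t[1]
        return fv(t[2]) - {t[1]}
    if kind == 'app':
        return fv(t[1]) | fv(t[2])
    return {t[1]} | fv(t[2])              # callpc / throw mention the tag t[1]

def subst(t, x, v):
    """Capture-avoiding substitution t[v/x]."""
    kind = t[0]
    if kind == 'var':
        return v if t[1] == x else t
    if kind == 'const':
        return t
    if kind == 'app':
        return ('app', subst(t[1], x, v), subst(t[2], x, v))
    if kind in ('callpc', 'throw'):
        tag = v[1] if t[1] == x else t[1]  # a tag is only replaced by a variable
        return (kind, tag, subst(t[2], x, v))
    y, body = t[1], t[2]                  # binders: lam and reset
    if y == x:
        return t
    if y in fv(v):                        # rename the binder to avoid capture
        y2 = fresh(y)
        body, y = subst(body, y, ('var', y2)), y2
    return (kind, y, subst(body, x, v))

def decompose(t):
    """Split a non-value term into (frames, redex); frames are outermost first."""
    if t[0] == 'app':
        if not is_value(t[1]):
            frames, r = decompose(t[1])
            return [('appL', t[2])] + frames, r
        if not is_value(t[2]):
            frames, r = decompose(t[2])
            return [('appR', t[1])] + frames, r
        return [], t
    if is_value(t[2]):                    # reset / callpc / throw around a value
        return [], t
    frames, r = decompose(t[2])
    return [(t[0], t[1])] + frames, r

def plug(frames, t):
    for f in reversed(frames):
        if f[0] == 'appL':
            t = ('app', t, f[1])
        elif f[0] == 'appR':
            t = ('app', f[1], t)
        else:
            t = (f[0], f[1], t)
    return t

def step(t):
    frames, r = decompose(t)
    kind = r[0]
    if kind == 'app':
        f, v = r[1], r[2]
        if f[0] == 'lam':                 # call-by-value beta
            return plug(frames, subst(f[2], f[1], v))
        if f[0] == 'const':               # delta rule (assumption of this sketch)
            return plug(frames, ('const', f[1](v[1])))
        raise RuntimeError('stuck application')
    if kind == 'reset':                   # E[#a V] ~> E[V]
        return plug(frames, r[2])
    a, v = r[1], r[2]                     # callpc_a V or throw_a V
    hits = [i for i, f in enumerate(frames) if f == ('reset', a)]
    if not hits:
        raise RuntimeError('error: no delimiter for tag ' + a)
    outer, inner = frames[:hits[-1]], frames[hits[-1] + 1:]
    if kind == 'throw':                   # E[#a E0[throw_a V]] ~> E[V]
        return plug(outer, v)
    u = fresh('u')                        # E[#a E0[callpc_a V]] ~> E[#a E0[V k]]
    k = ('lam', u, ('reset', a, plug(inner, ('app', ('var', u), UNIT))))
    return plug(outer + [('reset', a)] + inner, ('app', v, k))

def evaluate(t):
    while not is_value(t):
        t = step(t)
    return t

ADD = ('const', lambda a: lambda b: a + b)

def add(m, n):
    return ('app', ('app', ADD, m), n)

# #a(10 + callpc_a(\k. k(\w. 1))): the continuation 10 + [] is used twice.
prog = ('reset', 'a', add(('const', 10),
        ('callpc', 'a', ('lam', 'k', ('app', ('var', 'k'), ('lam', 'w', ('const', 1)))))))
print(evaluate(prog)[1])   # 21
```

The captured continuation expects a thunk, so the program passes `\w. 1` rather than `1`; this mirrors the type $\mathrm{Unit} \to A$ in the typing rule for callpc.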

5 Small-Step Reductions 

We want to set up an equational theory to reason about programs in $\lambda_{pc}^{atomic}$
and $\lambda_{pc}$. In this respect the reduction step given in the previous section is too
big. This section gives finer reduction rules which are easier to study.

First we define a singular context $E_s$ as follows:

$$E_s ::= [\,]M \mid V[\,] \mid \mathsf{callpc}_\alpha[\,] \mid \mathsf{throw}_\alpha[\,]$$

Next we define the notion of one-step reduction, denoted by $\to_1$, as
the compatible closure of the following reduction rules.

We split the reduction rules into two groups. The first group of reduction 
rules are as follows: 

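$$
\begin{array}{rcll}
(\lambda x.M)V & \to_1 & M[V/x] & (1) \\
\#_\alpha M & \to_1 & M \quad (\text{if } \alpha \notin FV(M)) & (2) \\
\#_\alpha\#_\beta M & \to_1 & \#_\alpha M[\alpha/\beta] & (3)
\end{array}
$$

The first one is the usual call-by-value $\beta$-reduction. The second one means that
if there are no control operators with tag $\alpha$, then the delimiter is useless. The
last one means that, if two delimiters are set in the same place, they can be
unified.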

The second group of reduction rules are as follows: 



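$$
\begin{array}{rcll}
\#_\alpha(\mathsf{callpc}_\alpha V) & \to_1 & \#_\alpha(V(\lambda u.\#_\alpha u\bullet)) & (4) \\
\#_\alpha(\mathsf{throw}_\alpha V) & \to_1 & \#_\alpha V & (5) \\
E_s[\mathsf{callpc}_\alpha V] & \to_1 & \mathsf{callpc}_\alpha(\lambda x.E_s[V(\lambda y.\#_\alpha x(\lambda z.E_s[y\bullet]))]) & (6) \\
\#_\beta(\mathsf{callpc}_\alpha \lambda x.M) & \to_1 & \mathsf{callpc}_\alpha \lambda x.\#_\beta M \quad (\text{if } \alpha \neq \beta) & (7) \\
E_s[\mathsf{throw}_\alpha V] & \to_1 & \mathsf{throw}_\alpha V & (8)
\end{array}
$$

In these reductions we assume $x$, $y$, $z$ are fresh variables.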

The first two rules express reductions with an empty partial continuation.
The remaining rules move callpc and throw outward by one step.

Although the right-hand side of the third rule (Rule (6)) may look quite
complex, it is similar to Felleisen's one-step reduction rule for first-class
continuations. A crucial point of this reduction rule is that, in the right-hand side, the
partial continuation object is delimited by a newly introduced delimiter $\#_\alpha$, so
the occurrences of $\mathsf{callpc}_\alpha$ in $E_s$ are bound by this new $\#_\alpha$. If we did not have
a new delimiter in the reduct, we could not simulate many interesting reductions
which were written with partial continuations of dynamic scope.
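To see how the small steps combine, the following worked derivation (a sketch added for illustration, not part of the original text) takes the singular context $E_s = [\,]N$, where $N$ is an arbitrary term assumed here, and reduces $\#_\alpha((\mathsf{callpc}_\alpha V)N)$ with Rules (6), (4), (1) and (2). The bound tag of the captured continuation is renamed to a fresh $\beta$ before substitution, in case $N$ mentions $\alpha$.

```latex
\begin{align*}
\#_\alpha((\mathsf{callpc}_\alpha V)N)
 &\to_1 \#_\alpha\,\mathsf{callpc}_\alpha(\lambda x.(V(\lambda y.\#_\alpha x(\lambda z.(y\bullet)N)))N)
   && \text{by (6)}\\
 &\to_1 \#_\alpha((\lambda x.(V(\lambda y.\#_\alpha x(\lambda z.(y\bullet)N)))N)(\lambda u.\#_\beta u\bullet))
   && \text{by (4), bound tag renamed to } \beta\\
 &\to_1 \#_\alpha((V(\lambda y.\#_\alpha(\lambda u.\#_\beta u\bullet)(\lambda z.(y\bullet)N)))N)
   && \text{by (1)}\\
 &\to_1 \#_\alpha((V(\lambda y.\#_\alpha\#_\beta((\lambda z.(y\bullet)N)\bullet)))N)
   && \text{by (1)}\\
 &\to_1 \#_\alpha((V(\lambda y.\#_\alpha\#_\beta((y\bullet)N)))N)
   && \text{by (1)}\\
 &\to_1 \#_\alpha((V(\lambda y.\#_\alpha((y\bullet)N)))N)
   && \text{by (2), since } \beta \text{ is fresh}
\end{align*}
```

With $E_0 = [\,]N$ the last line is $\#_\alpha E_0[V(\lambda y.\#_\alpha E_0[y\bullet])]$, which is the reduct that the ML-like rule for callpc produces in one big step.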

Let $\to$ be the reflexive, transitive closure of $\to_1$, and $=$ be the least equivalence
relation which contains $\to$.



Representing the $\mathcal{F}$-like operator

The callpc operator does not discard the current continuation. But we can
define an operator for creating partial continuation objects which discards the
current continuation. Let us define control as follows:

$$\mathsf{control}_\alpha M \equiv \mathsf{callpc}_\alpha(\lambda x.\mathsf{throw}_\alpha(Mx))$$

Then this operator has the following reduction in the ML-like operational
semantics, as desired.

$$E[\#_\alpha E_0[\mathsf{control}_\alpha V]] \leadsto E[\#_\alpha V(\lambda u.\#_\alpha E_0[u\bullet])]$$

The control operator is closer to Felleisen's $\mathcal{F}$-operator and Danvy and
Filinski's shift operator.



A Small Example 

The following example was given by Danvy and Filinski. In order to express
this example, we assume that $+$ and its reductions are added to our calculus.

$$
\begin{array}{rl}
 & 1 + \#_\alpha(10 + (\lambda x.\mathsf{control}_\alpha(\lambda k.k(k(x))))100) \\
\to & 1 + \#_\alpha(10 + \mathsf{control}_\alpha(\lambda k.k(k(100)))) \\
= & 1 + \#_\alpha(10 + \mathsf{callpc}_\alpha(\lambda y.\mathsf{throw}_\alpha(\lambda k.k(k(100)))y)) \\
\to & 1 + \#_\alpha(10 + (\lambda y.\mathsf{throw}_\alpha(y(y(100))))(\lambda u.\#_\alpha 10 + u)) \\
\to & 1 + \#_\alpha\mathsf{throw}_\alpha((\lambda u.10 + u)((\lambda u.10 + u)(100))) \\
\to & 1 + \#_\alpha 120 \\
\to & 121
\end{array}
$$

A Note on Control Operators with Dynamic Scope

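As we explained, our control operators have static scope. If we changed
them to have dynamic scope in $\lambda_{pc}$, we would have a non-terminating reduction
as follows.

Let $P$ be $\lambda x.(\#_\alpha(\lambda y.z)(xw))x$, and $Q$ be $\lambda u.\mathsf{throw}_\alpha P$. Then, $PQ$ has type
$(C \to A) \to B$ under the type environment $\{z : (C \to A) \to B, w : C\}$.
If the delimiter has dynamic scope, the reduction sequence from $PQ$ does not
terminate, as follows:

$$
\begin{array}{rcl}
PQ & \to & (\#_\alpha((\lambda y.z)(Qw)))Q \\
   & \to & (\#_\alpha((\lambda y.z)(\mathsf{throw}_\alpha P)))Q \\
   & \to & (\#_\alpha\mathsf{throw}_\alpha P)Q \\
   & \to & (\#_\alpha P)Q \\
   & \to & PQ
\end{array}
$$

We do not have this reduction sequence with static scope operators, since we
must rename $\alpha$ in $P$ before substituting $Q$ for $x$ in $P$.

$$
\begin{array}{rcl}
PQ & \to & (\#_\beta((\lambda y.z)(Qw)))Q \\
   & \to & (\#_\beta((\lambda y.z)(\mathsf{throw}_\alpha P)))Q \\
   & \to & (\#_\beta\mathsf{throw}_\alpha P)Q \\
   & \to & (\mathsf{throw}_\alpha P)Q \\
   & \to & \mathsf{throw}_\alpha P
\end{array}
$$

The term $\mathsf{throw}_\alpha P$ cannot be reduced any further.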

6 Properties of our Calculus 

This section gives properties of $\lambda_{pc}^{atomic}$ and $\lambda_{pc}$.

First of all, the reductions are closed under substitution.

Theorem 1. If $M \to M'$ and $N \to N'$, then $M[N/x] \to M'[N'/x]$.

This is obvious since our reduction rules are compatible with arbitrary
contexts. Note that the calculus for full continuations contains a so-called
computation rule which is not compatible with contexts. For instance, $\mathcal{C}M \to M(\lambda x.\mathcal{A}x)$
is only applicable in the top-level context, where $\mathcal{C}$ is Felleisen's control operator
and $\mathcal{A}$ is the abort operator.

The subject reduction property expresses the type safety. 

Theorem 2 (Subject Reduction Property). If $\Gamma \vdash M : A$ and $M \to N$,
then we have that $\Gamma \vdash N : A$ and $FV(N) \subseteq FV(M)$.

Proof. We only verify the most complex reduction rule (6). Suppose that $E_s$
has type $C$, and its hole has type $A$. The left-hand-side term is typed as follows
(omitting the type environment for readability):



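$$\dfrac{\dfrac{M : ((\mathrm{Unit} \to A) \to B) \to A}{\mathsf{callpc}_\alpha M : A}}{E_s[\mathsf{callpc}_\alpha M] : C}$$

Then, the right-hand-side term is typed as follows (each line follows from the lines above it):

$$
\begin{array}{l}
y : \mathrm{Unit} \to A, \qquad \bullet : \mathrm{Unit} \\
y\bullet : A \\
E_s[y\bullet] : C \\
x : (\mathrm{Unit} \to C) \to B, \qquad \lambda z.E_s[y\bullet] : \mathrm{Unit} \to C \\
x(\lambda z.E_s[y\bullet]) : B \\
\#_\alpha x(\lambda z.E_s[y\bullet]) : B \\
M : ((\mathrm{Unit} \to A) \to B) \to A, \qquad \lambda y.\#_\alpha x(\lambda z.E_s[y\bullet]) : (\mathrm{Unit} \to A) \to B \\
M(\lambda y.\#_\alpha x(\lambda z.E_s[y\bullet])) : A \\
E_s[M(\lambda y.\#_\alpha x(\lambda z.E_s[y\bullet]))] : C \\
\lambda x.E_s[M(\lambda y.\#_\alpha x(\lambda z.E_s[y\bullet]))] : ((\mathrm{Unit} \to C) \to B) \to C \\
\mathsf{callpc}_\alpha(\lambda x.E_s[M(\lambda y.\#_\alpha x(\lambda z.E_s[y\bullet]))]) : C
\end{array}
$$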

We also have that the sets of free variables are the same.

Other cases are proved easily. □

We then show that $\lambda_{pc}$ and $\lambda_{pc}^{atomic}$ are confluent by using Takahashi's parallel
reduction method in conjunction with Hardin's interpretation method.

Theorem 3. The calculi $\lambda_{pc}$ and $\lambda_{pc}^{atomic}$ are confluent.

Proof. We first define the d-normal form $d(M)$ of a term $M$ as the term obtained from $M$
by applying the reduction (3) (unification of delimiters) as many times as
possible, namely, by contracting successive applications of delimiters. Clearly,
for each term, its d-normal form is unique up to renaming of bound variables.
We then define the parallel reduction $\Rightarrow$ on terms as follows:

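- $x \Rightarrow x$, and $c \Rightarrow c$.

- If $M_i \Rightarrow M_i'$ for $i = 1, 2$, then $\lambda x.M_1 \Rightarrow \lambda x.M_1'$, $M_1 M_2 \Rightarrow M_1' M_2'$,
$\mathsf{callpc}_\alpha M_1 \Rightarrow \mathsf{callpc}_\alpha M_1'$, $\mathsf{throw}_\alpha M_1 \Rightarrow \mathsf{throw}_\alpha M_1'$, and $\#_\alpha M_1 \Rightarrow \#_\alpha M_1'$.

- If $M \Rightarrow M'$ and $V \Rightarrow V'$, then $(\lambda x.M)V \Rightarrow M'[V'/x]$.

- If $M \Rightarrow M'$ and $\alpha \notin FV(M)$, then $\#_\alpha M \Rightarrow M'$.

- If $V \Rightarrow V'$, then $\#_\alpha\mathsf{callpc}_\alpha V \Rightarrow \#_\alpha V'(\lambda u.\#_\alpha u\bullet)$.

- If $V \Rightarrow V'$, then $\#_\alpha\mathsf{throw}_\alpha V \Rightarrow \#_\alpha V'$.

- If $V \Rightarrow V'$ and $E_s \Rightarrow E_s'$, then $E_s[\mathsf{callpc}_\alpha V] \Rightarrow \mathsf{callpc}_\alpha(\lambda x.E_s'[V'(\lambda y.\#_\alpha x(\lambda z.E_s'[y\bullet]))])$.

- If $V \Rightarrow V'$, then $E_s[\mathsf{throw}_\alpha V] \Rightarrow \mathsf{throw}_\alpha V'$.

- If $M \Rightarrow M'$, then $\#_\beta\mathsf{callpc}_\alpha\lambda x.M \Rightarrow \mathsf{callpc}_\alpha\lambda x.\#_\beta M'$.

- If $M \Rightarrow M'$, then $M \Rightarrow d(M')$.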

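Next, we define the term $M^*$ for each term $M$ as follows. In the following
definition, if more than one clause matches the term $M$, then we always take the
first matching clause as the definition of $M^*$.

- $x^* = x$, and $c^* = c$.

- $(\lambda x.M)^* = \lambda x.M^*$.

- $((\lambda x.M)V)^* = d(M^*[V^*/x])$.

- $(E_s[\mathsf{callpc}_\alpha V])^* = d(\mathsf{callpc}_\alpha\lambda x.E_s^*[V^*(\lambda y.\#_\alpha x(\lambda z.E_s^*[y\bullet]))])$.

- $(E_s[\mathsf{throw}_\alpha V])^* = \mathsf{throw}_\alpha V^*$.

- $(MN)^* = M^* N^*$.

- $(\mathsf{callpc}_\alpha M)^* = \mathsf{callpc}_\alpha M^*$, and $(\mathsf{throw}_\alpha M)^* = \mathsf{throw}_\alpha M^*$.

- $(\#_\alpha M)^* = M^*$ if $\alpha \notin FV(M)$.

- $(\#_\alpha\#_\beta M)^* = (\#_\alpha M)^*[\alpha/\beta]$.

- $(\#_\alpha\mathsf{callpc}_\alpha V)^* = \#_\alpha(V^*(\lambda u.u\bullet))$.

- $(\#_\alpha\mathsf{throw}_\alpha V)^* = d(\#_\alpha V^*)$.

- $(\#_\beta\mathsf{callpc}_\alpha\lambda x.M)^* = d(\mathsf{callpc}_\alpha\lambda x.\#_\beta M^*)$.

- $(\#_\alpha M)^* = d(\#_\alpha M^*)$.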

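Then by case analysis, we have that if $M \Rightarrow N$ then $N \Rightarrow M^*$, which
implies the diamond property of $\Rightarrow$. In this proof, the only problematic case is
$\#_\gamma\#_\beta\mathsf{callpc}_\alpha\lambda x.M \Rightarrow \#_\gamma\mathsf{callpc}_\alpha\lambda x.\#_\beta M$, but it can be proved by some
calculation.

We also have that $M \to_1 N$ implies $M \Rightarrow N$, which in turn implies $M \to N$; hence $\to$ is confluent.

Note that these arguments apply to both $\lambda_{pc}^{atomic}$ and $\lambda_{pc}$. □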

Subject reduction and confluence are the most basic properties of typed
calculi, but by restricting the tag types to be atomic, we have a few more desirable
properties.

Theorem 4 (Normal Form Property). Let $M$ be a closed normal term in
$\lambda_{pc}^{atomic}$. Then $M$ is either a constant or of the form $\lambda x.M'$.

Proof. We say a term $M$ is half-closed if $FV(M) = \{x_1, \ldots, x_n\}$ and the
type of $x_i$ is $\neg A_i$ for $1 \le i \le n$. We can show by induction that a half-closed
normal term $M$ is of one of the following forms:

$$x_i, \quad c, \quad \lambda x.M, \quad \mathsf{callpc}_\alpha V, \quad \text{or} \quad \mathsf{throw}_\alpha V$$

The point here is that the type $\neg A$ is not defined as $A \supset \bot$, in which case $x_i c$
may be a half-closed normal term.

It follows that, if $M$ is a closed normal term, it is a constant or a $\lambda$-abstraction.

□

The normal form property together with the subject reduction property
ensures type soundness for $\lambda_{pc}^{atomic}$.

Unfortunately, the normal form property does not hold for $\lambda_{pc}$; there is a
closed normal term of the form $\#_\alpha\lambda x.M$ in $\lambda_{pc}$. To obtain the property in $\lambda_{pc}$,
we would have to add reduction rules such as $(\#_\alpha V_1)V_2 \to_1 \#_\alpha V_1'$, where $V_1'$ is obtained
by an appropriate substitution. However, this single reduction rule does not suffice,
and we would have to add more and more. We believe that, even under the restriction
of $\lambda_{pc}^{atomic}$, we can express many programming examples, since we usually do not
want to place delimiters at function types.

We finally show that our small step reductions do characterize the operational 
semantics given earlier. 

Theorem 5. Let $M$ and $N$ be closed terms. If $M \leadsto N$, then $M = N$ in $\lambda_{pc}^{atomic}$.

Proof. The theorem is proved by induction on the length of evaluation. The
base case is trivial. We list here only the major cases for the induction step.

Case-1. $E[(\lambda x.M)V] \leadsto_1 E[M[V/x]]$.

The same reduction is included in $\to_1$.

Case-2. $E[\#_\alpha V] \leadsto_1 E[V]$ where $\alpha \notin FV(V)$.

Since we restricted the type of $\alpha$ to be of the form $\neg K$ where $K$ is atomic, the
term $V$ must be a constant of that type. We have $\#_\alpha c \to_1 c$ in $\lambda_{pc}^{atomic}$.

Case-3. $E[\#_\alpha E_0[\mathsf{callpc}_\alpha V]] \leadsto_1 E[\#_\alpha E_0[V(\lambda u.\#_\alpha E_0[u\bullet])]]$.

We use one more induction on the structure of $E_0$ to prove this case.

Case-3-1. If $E_0 = [\,]$, then $E[\#_\alpha\mathsf{callpc}_\alpha V] \to_1 E[\#_\alpha V(\lambda u.\#_\alpha u\bullet)]$, which is
$E[\#_\alpha E_0[V(\lambda u.\#_\alpha E_0[u\bullet])]]$.

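Case-3-2. If $E_0 = F[E_s]$ where $F$ is an evaluation context and $E_s$ is a singular
context, then we have

$$
\begin{array}{rl}
 & \#_\alpha F[E_s[\mathsf{callpc}_\alpha V]] \\
\to_1 & \#_\alpha F[\mathsf{callpc}_\alpha\lambda x.E_s[V(\lambda y.\#_\alpha x(\lambda z.E_s[y\bullet]))]] \\
= & \#_\alpha F[(\lambda x.E_s[V(\lambda y.\#_\alpha x(\lambda z.E_s[y\bullet]))])(\lambda u.\#_\alpha F[u\bullet])] \quad \text{(by induction hypothesis)} \\
\to & \#_\alpha F[E_s[V(\lambda y.\#_\alpha(\lambda u.\#_\beta F'[u\bullet])(\lambda z.E_s[y\bullet]))]] \\
\to & \#_\alpha F[E_s[V(\lambda y.\#_\alpha\#_\beta F'[E_s[y\bullet]])]] \\
\to & \#_\alpha F[E_s[V(\lambda y.\#_\alpha F[E_s[y\bullet]])]] \\
= & \#_\alpha E_0[V(\lambda u.\#_\alpha E_0[u\bullet])]
\end{array}
$$

where $F'$ is the result of substituting $\beta$ for free occurrences of $\alpha$ in $F$ to avoid
conflict of bound variables.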

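Case-3-3. If $E_0 = F[\#_\beta[\,]]$ where $F$ is an evaluation context, then we have
(assuming $V$ is $\lambda x.M$)

$$
\begin{array}{rl}
 & E[\#_\alpha F[\#_\beta\mathsf{callpc}_\alpha\lambda x.M]] \\
\to_1 & E[\#_\alpha F[\mathsf{callpc}_\alpha\lambda x.\#_\beta M]] \\
= & E[\#_\alpha F[(\lambda x.\#_\beta M)(\lambda u.\#_\alpha F[u\bullet])]] \quad \text{(by induction hypothesis)} \\
\to & E[\#_\alpha F[\#_\beta M[\lambda u.\#_\alpha F[u\bullet]/x]]]
\end{array}
$$

On the other hand, we have

$$
\begin{array}{rl}
 & E[\#_\alpha F[\#_\beta(\lambda x.M)(\lambda u.\#_\alpha F[\#_\beta u\bullet])]] \\
\to & E[\#_\alpha F[\#_\beta(\lambda x.M)(\lambda u.\#_\alpha F[u\bullet])]] \\
\to_1 & E[\#_\alpha F[\#_\beta M[\lambda u.\#_\alpha F[u\bullet]/x]]]
\end{array}
$$

So we have the equality. When $V$ is not of the form $\lambda x.M$, namely, it is a variable
or a constant, the proof is easier since there are no free occurrences of the
tag $\beta$.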

Case-4. $E[\#_\alpha E_0[\mathsf{throw}_\alpha V]] \leadsto_1 E[V]$.

Since $\lambda_{pc}^{atomic}$ does not allow a $\lambda$-abstraction for the above $V$, $V$ must be either
a variable or a constant. We then prove this case similarly to Case-3. If $E_0$ is
composed of a delimiter, namely, $E_0 = \#_\beta[\,]$, then we use the fact that $V$ does
not contain $\beta$ free. We also have $E[\#_\alpha E_0[\mathsf{throw}_\alpha V]] \to E[\#_\alpha V]$, and $\#_\alpha V \to V$.
Other cases are easy. □

By this theorem and the confluence of $\lambda_{pc}^{atomic}$, we have the following corollary,
which means that our reduction rules really reflect the intended operational
semantics.

Corollary 1 (Correspondence of $\leadsto$ and $\to$ in $\lambda_{pc}^{atomic}$). Let $M$ be a closed
term and $V$ be a value. If $M \leadsto V$, then $M \to V$ in $\lambda_{pc}^{atomic}$.



7 A Logical View 

The Curry-Howard isomorphism relates typed lambda calculi and intuitionistic
logical systems. As Griffin and other researchers showed, the isomorphism can
be extended to the relationship between typed lambda calculi with sequential
control operators and classical logic. In this section, we show that $\lambda_{pc}^{atomic}$
and $\lambda_{pc}$ also correspond to classical logic.



$\lambda_{pc}^{atomic}$ and $\lambda_{pc}$ are Classical Logic

We assume that $\bot$ is included as an atomic type in $\lambda_{pc}^{atomic}$. (If it is not the case,
choose any atomic type as $\bot$, since we do not use the $\bot$-elimination rule.) Let
$\phi$ be a map which simply discards the left-hand side of the colon from a judgement
$M : A$, namely, $\phi(M : A) = A$. A type in $\lambda_{pc}^{atomic}$ and $\lambda_{pc}$ can be regarded as
a formula in implicational logic where $\to$ is interpreted by implication, $\neg A$ is
interpreted by $A \supset \bot$, and Unit is a provable formula (say, $\bot \to \bot$). The map $\phi$
naturally extends to type environments.

Theorem 6 (Isomorphism between $\lambda_{pc}^{atomic}$/$\lambda_{pc}$ and classical logic). Let
$\Gamma$ be a type environment, $M$ be a preterm, and $A$ be a type (a formula). Then
$\Gamma \vdash M : A$ holds in $\lambda_{pc}^{atomic}$ if and only if $\phi(\Gamma) \vdash A$ holds in classical
implicational logic. The same holds for $\lambda_{pc}$.

Proof (only-if part). We only have to verify that the following three typing rules
are preserved through $\phi$.

(i) For the case of the delimiter, we have to infer $\Gamma \vdash B$ from $\Gamma \cup \{\neg B\} \vdash B$.
(ii) For the case of callpc, we have to infer $\Gamma \cup \{\neg B\} \vdash A$ from $\Gamma \vdash ((\mathrm{Unit} \to A) \to B) \to A$
(Unit is some provable formula). (iii) For the case of throw, we
have to infer $\Gamma \cup \{\neg B\} \vdash C$ from $\Gamma \vdash B$.

All of them can be proved in classical logic.

Proof (if part).

We only have to show that classical reasoning can be simulated by $\lambda_{pc}^{atomic}$. To
show this, we shall prove that, for any types $A$ and $B$, there exists a closed term
of type $((A \to B) \to A) \to A$. This is shown by induction on the type $A$.

For the base case ($A$ is an atomic type $K$), the following derivation shows that
$((K \to B) \to K) \to K$ is inhabited (each line follows from the lines above it).

$$
\begin{array}{l}
\{x : K\} \vdash x : K \\
\{x : K, \alpha : \neg K\} \vdash \mathsf{throw}_\alpha x : B \\
\{\alpha : \neg K\} \vdash \lambda x.\mathsf{throw}_\alpha x : K \to B, \qquad \{y : (K \to B) \to K\} \vdash y : (K \to B) \to K \\
\{y : (K \to B) \to K, \alpha : \neg K\} \vdash y(\lambda x.\mathsf{throw}_\alpha x) : K \\
\{y : (K \to B) \to K\} \vdash \#_\alpha y(\lambda x.\mathsf{throw}_\alpha x) : K \\
\{\} \vdash \lambda y.\#_\alpha y(\lambda x.\mathsf{throw}_\alpha x) : ((K \to B) \to K) \to K
\end{array}
$$

For the induction step, let $A$ be $C \to D$. By the induction hypothesis, we have a
closed term $M$ of type $((D \to B) \to D) \to D$. Then we can easily show that the
closed term $\lambda ux.M(\lambda y.u(\lambda z.y(zx))x)$ has type $((A \to B) \to A) \to A$. Hence we
can prove all the instances of Peirce's law in $\lambda_{pc}^{atomic}$.

The proof is even easier for $\lambda_{pc}$, since we can use an arbitrary type in place
of $K$ in the derivation above. □
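The base-case term $\lambda y.\#_\alpha y(\lambda x.\mathsf{throw}_\alpha x)$ has a familiar computational reading: it hands $y$ an escape function that returns to the enclosing delimiter. A minimal Python sketch of that reading follows, assuming exceptions as a stand-in for throw and a fresh object as the tag; the names `Escape` and `peirce` are hypothetical and chosen only for this illustration.

```python
class Escape(Exception):
    """Signals a jump back to the delimiter that owns `tag` (hypothetical helper)."""
    def __init__(self, tag, value):
        super().__init__(tag, value)
        self.tag = tag
        self.value = value

def peirce(y):
    """Python reading of \\y. #a y(\\x. throw_a x)."""
    tag = object()                        # the tag bound by the delimiter

    def escape(x):                        # \x. throw_a x
        raise Escape(tag, x)

    try:
        return y(escape)                  # #a (y escape)
    except Escape as e:
        if e.tag is tag:
            return e.value
        raise                             # belongs to an outer delimiter

# If y ignores the escape function, its own result is returned.
print(peirce(lambda k: 1))          # 1
# If y calls the escape function, the pending "+ 1" is discarded.
print(peirce(lambda k: k(2) + 1))   # 2
```

The sketch only illustrates the throw-only fragment used in the proof; it says nothing about callpc, which needs re-entrant continuations that Python exceptions do not provide.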

Note that we used the throw-operator only, which means that $\lambda_{pc}^{atomic}$ (and
$\lambda_{pc}$) without the callpc-operator is still a classical calculus. In fact, Sato and the
present author developed calculi for the catch/throw mechanism which
correspond to classical propositional logic. It is easy to see that the delimiter
and the throw-operator can be understood as the catch and the throw operators,
and such a subcalculus of $\lambda_{pc}^{atomic}$ (or $\lambda_{pc}$) constitutes a confluent subcalculus of
the classical catch/throw calculi in [18, 14].

Note also that our reduction rules can be thought of as proof reductions in
classical logic. We already gave the subject reduction for the reduction (6), and
if we forget the terms, we obtain a proof reduction. The resulting proof reduction
is (by nature) similar to Griffin's reduction rules for Felleisen's control operator.

The callpc-operator is not Classical

On the other hand, if we eliminate the throw-operator from $\lambda_{pc}^{atomic}$ (or $\lambda_{pc}$), the
resulting calculus becomes strictly weaker in the logical sense.

Suppose $\{\} \vdash M : A$ is inferred in $\lambda_{pc}$ where $M$ does not contain the
throw-operator. We shall show that $A$ is provable in minimal logic. To simplify the
matter, we assume $M$ uses only one tag variable $\alpha$. Suppose $M$ contains $k$
subterms of the form $\mathsf{callpc}_\alpha N_i$, and the type of $N_i$ is $((\mathrm{Unit} \to A_i) \to B) \to A_i$.
We put $P_i \equiv (((\mathrm{Unit} \to A_i) \to B) \to A_i) \to A_i$. The introduction rule
of callpc is (when mapped by $\phi$) provable if we assume each $P_i$. Hence we
can regard the delimiter introduction rule (through $\phi$) as eliminating $P_1, \ldots, P_k$
from the assumption list. In other words, our goal is to prove $\Gamma \vdash B$ from
$\Gamma, P_1, \ldots, P_k \vdash B$. But the formula $(\bigwedge_{i=1}^{k} P_i \to B) \to B$ is provable in minimal
logic (which can be shown by induction on $k$).
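For $k = 1$ the claim can be checked by hand. The following sketch (an illustration added here, not taken from the original) builds a plain $\lambda$-term of type $(P_1 \to B) \to B$ without any control operator, using only the proof $\bullet$ of the provable formula Unit. The variable names $f$, $g$, $h$, $t$ are chosen for this illustration.

```latex
\begin{align*}
P_1 &\equiv (((\mathrm{Unit} \to A_1) \to B) \to A_1) \to A_1 \\
f &: P_1 \to B \\
t &: \mathrm{Unit} \to A_1 \;\Longrightarrow\; t\bullet : A_1 \\
\lambda h.\,t\bullet &: P_1
   \qquad (h : ((\mathrm{Unit} \to A_1) \to B) \to A_1 \text{ is not used}) \\
\lambda t.\,f(\lambda h.\,t\bullet) &: (\mathrm{Unit} \to A_1) \to B \\
g &: ((\mathrm{Unit} \to A_1) \to B) \to A_1
   \;\Longrightarrow\; g(\lambda t.\,f(\lambda h.\,t\bullet)) : A_1 \\
\lambda g.\,g(\lambda t.\,f(\lambda h.\,t\bullet)) &: P_1 \\
\lambda f.\,f(\lambda g.\,g(\lambda t.\,f(\lambda h.\,t\bullet))) &: (P_1 \to B) \to B
\end{align*}
```

The assumption $f$ is used twice, once to produce the final $B$ and once inside the argument built for $g$, which is the pattern that the induction on $k$ presumably repeats for each further $P_i$.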

From this fact, one may think that the callpc-operator may be expressible
by standard combinators such as S and K, but we believe this is not the case. The
proof term of the above theorem has the same type as our control operators, but
it behaves quite differently (the latter term is not interesting in computational
aspects).

8 Conclusion 

Partial continuations were proposed by Felleisen and others, and there has been
much research on partial continuations since then. Compared to existing calculi for
partial continuations, the characteristic feature of our approach is that our
calculus is based on a type-theoretic framework. We showed that our calculus (i)
enjoys the subject reduction property, (ii) is confluent, and (iii) admits the
standard Curry-Howard isomorphism (by which it corresponds to classical logic).
Hieb et al.'s subcontinuations also have static scope, but their approach lacks
the type-safety property (which means that it sometimes generates uncaught
partial-continuation objects). Our approach can be thought of as a refinement of (a
typed version of) Hieb et al.'s subcontinuations, and we believe that our calculus
can be a basis for syntactic, type-theoretic analysis of partial continuations and
other variations of control operators.

Since we can abstract tags, we believe that many examples written with Felleisen's
operators and with Danvy and Filinski's operators can be written in our calculus. In fact we have already
worked out the tree-traversal example by Felleisen.

Future Work. Several research topics are left for future work. The
first target is the strong normalization (SN) of $\lambda_{pc}^{atomic}$. A common tool to show
it is a type-preserving CPS-translation. We gave a CPS-translation for $\lambda_{pc}^{atomic}$
in our draft, but it does not preserve typing, so the SN of $\lambda_{pc}^{atomic}$ is an open
problem. Other directions are expressiveness and applications. In this paper, we
confined ourselves to sequential programs, but as many authors have pointed out,
partial continuations are a useful tool for giving control over parallel/concurrent
programs. Also, there should be applications to formalizing mobile computing.



Abstract. We introduce several structures between Church-style and
Curry-style based on a formalism of partially typed terms. In this uniform
framework, we study the static properties of the $\lambda$-terms between the
two styles. It is proved that type checking, typability, and type inference
for domain-free $\lambda 2$ are in general undecidable. A simple instance of the
second-order unification problem is reduced to the problem of type
inference for domain-free $\lambda 2$. The typability problem is undecidable even
for a predicative fragment of domain-free $\lambda 2$, called the rank 2 fragment.
It is also found that making polymorphic domains free and the use of
type-holes [ ] are independently responsible for the undecidability of the
partial polymorphic type reconstruction problem.



1 Introduction 

There are three known styles of (typed) $\lambda$-terms, called Curry-style,
Church-style, and domain-free style. For some systems, such as the simply typed $\lambda$-calculus
and ML, it is well known that the Curry-style and the corresponding
Church-style are essentially equivalent. Hence, the Curry system serves as a
shorthand for the Church system. On the other hand, Barthe, Sørensen,
and Hatcliff recently introduced the notion of domain-free pure type systems. Terms
in domain-free style have domain-free $\lambda$-abstraction. Barthe and Sørensen posed
the question of whether the problem of type checking is decidable for
domain-free $\lambda 2$ and $\lambda\omega$ (page 18 of their paper). In this paper, we show that type checking,
typability, and type inference are, in general, undecidable for domain-free $\lambda 2$. In
order to prove this, we reduce simple instances of the second-order unification
problem to the problem of strong type inference for domain-free $\lambda 2$.

The original motivation for domain-free systems comes from the study of classical
type systems, which are extensions of intuitionistic type theory with
classical rules such as double negation elimination. Domain-free systems are
useful for giving continuation-passing style translations, which provide a
certain semantics of classical type systems. Further, when we construct a
polymorphic call-by-value calculus with control operators such as callcc or $\mu$-operators,
the Curry style cannot work for a consistent system. For instance, see the
traditional counterexample (ML with callcc is unsound) by Harper and Lillibridge,
and see also the related proof-theoretical observation. Hence, the domain-free $\lambda\mu$-calculus
has been introduced, where the explicit type annotations for
polymorphic terms play the role of choosing an appropriate computation under
call-by-value. Our result in this paper also gives a negative answer to the
problem of type checking for the second-order $\lambda\mu$-calculus in domain-free style, which
is a variant of Parigot's $\lambda\mu$-calculus in Curry style.

J. van Leeuwen et al. (Eds.): IFIP TCS 2000, LNCS 1872, pp. 505 ff., 2000.

Domain-free systems are also useful for the study of partial polymorphic type
reconstruction. Boehm and Pfenning have proven that the partial type
reconstruction problem is, in general, undecidable for the second-order $\lambda$-calculus.
The typability problem for domain-free $\lambda 2$ can be regarded as a special case
of the problem of type reconstruction for partially typed terms. Our result in
this paper means that the restricted problem of type reconstruction for
partially typed terms is still undecidable. Moreover, inspection of the undecidability
proof reveals that the typability problem is undecidable even for a predicative
fragment of domain-free $\lambda 2$, called the rank 2 fragment. This analysis
also implies that the partial type reconstruction problem is
still undecidable for the rank 2 fragment of the second-order $\lambda$-calculus, in contrast to
the decidable typability for the rank 2 fragment of $\lambda 2$ in Curry style. From
the viewpoint of partially typed terms, we introduce fine structures between
Church-style and Curry-style, including the domain-free style. In this uniform
framework, we study the static properties of the $\lambda$-terms between the two styles.
It is found that making polymorphic domains free and the use of type-holes [ ]
are independently responsible for the undecidability of the partial type
reconstruction problem. In this sense, this work can give a guide to the construction
of typed languages with decidable type checking and typability.



2 Curry-Style, Church-Style, and Domain-Free 

In Curry-style, terms are essentially type free, and types can be
assigned by the rules of a type theory if the term is well-formed. Terms in the Church-style typed
$\lambda$-calculus, on the one hand, are originally defined only from uniquely type-annotated
variables. Following Curry's philosophy, today one has the notion of
pseudo-terms separated from a type theory. On the other hand, terms in domain-free
(DF) style have domain-free $\lambda$-abstraction, and the second-order $\lambda$-calculus in
domain-free style can be regarded, in a sense, as an intermediate representation
between à la Curry and à la Church, as shown in the following table:

Types $\sigma ::= t \mid \sigma \to \sigma \mid \forall t.\sigma$

Styles of (typed) $\lambda 2$-terms

$\lambda 2$-(pseudo)terms   object-var abst.         term app.   type-var abst.   type app.
Church-style               $\lambda x{:}\sigma.M$   $MM$        $\lambda t.M$    $M[\sigma]$
Domain-Free                $\lambda x.N$            $NN$        $\lambda t.N$    $N[\sigma]$
Curry-style                $\lambda x.U$            $UU$

We give a definition of the domain-free $\lambda 2$-calculus. In terms of domain-free pure
type systems, this domain-free system is constructed from sorted variables;
a metavariable for variables of the first sort (term variables) is $x$ and a
metavariable for variables of the second sort (type variables) is $t$. Then, on the basis
of the sorted variables, type abstraction can be represented by $\lambda t$ rather than
the traditional $\Lambda t$, and we also have an explicit distinction between terms and types.
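One way to read the table is that each row is obtained from the row above by erasing information. The following Python sketch makes that reading concrete under stated assumptions: terms are encoded as tuples, types are kept as opaque strings, and the function names `to_domain_free` and `to_curry` are hypothetical. It is an illustration of the three syntaxes, not a construction taken from the paper.

```python
# Church-style terms (encoding assumed for this sketch):
#   ('var', x) | ('lam', x, sigma, M) | ('app', M, N)
#   ('tlam', t, M) | ('tapp', M, sigma)
# Domain-free terms drop sigma from 'lam'; Curry-style terms also drop
# type abstraction and type application.

def to_domain_free(m):
    """Erase the domain annotation of every object-variable abstraction."""
    kind = m[0]
    if kind == 'var':
        return m
    if kind == 'lam':                     # ('lam', x, sigma, body)
        _, x, _sigma, body = m
        return ('lam', x, to_domain_free(body))
    if kind == 'app':
        return ('app', to_domain_free(m[1]), to_domain_free(m[2]))
    if kind == 'tlam':                    # type abstraction is kept
        return ('tlam', m[1], to_domain_free(m[2]))
    if kind == 'tapp':                    # type application is kept
        return ('tapp', to_domain_free(m[1]), m[2])
    raise ValueError(f"unknown term kind: {kind}")

def to_curry(n):
    """Erase type abstractions and type applications from a domain-free term."""
    kind = n[0]
    if kind == 'var':
        return n
    if kind == 'lam':                     # ('lam', x, body)
        return ('lam', n[1], to_curry(n[2]))
    if kind == 'app':
        return ('app', to_curry(n[1]), to_curry(n[2]))
    if kind == 'tlam':                    # ('tlam', t, body): keep only the body
        return to_curry(n[2])
    if kind == 'tapp':                    # ('tapp', body, sigma): keep only the body
        return to_curry(n[1])
    raise ValueError(f"unknown term kind: {kind}")

# Polymorphic identity in Church style, instantiated at a type and applied to y.
church_id = ('tlam', 't', ('lam', 'x', 't', ('var', 'x')))
church_use = ('app', ('tapp', church_id, 'int'), ('var', 'y'))

df_use = to_domain_free(church_use)
print(df_use)
print(to_curry(df_use))
```

Going in the opposite direction, from a domain-free term back to a Church-style term, means recovering the erased domains, which is the reconstruction problem whose undecidability the paper studies.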



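Type Assignment Rules for Domain-Free λ2

\[
\Gamma \vdash x : \Gamma(x)
\]

\[
\frac{\Gamma \vdash N_1 : \sigma_1 \to \sigma_2 \qquad \Gamma \vdash N_2 : \sigma_1}
     {\Gamma \vdash N_1 N_2 : \sigma_2}\ (\to E)
\qquad
\frac{\Gamma, x{:}\sigma_1 \vdash N : \sigma_2}
     {\Gamma \vdash \lambda x.N : \sigma_1 \to \sigma_2}\ (\to I)
\]

\[
\frac{\Gamma \vdash N : \forall t.\sigma_1}
     {\Gamma \vdash N[\sigma_2] : \sigma_1[t := \sigma_2]}\ (\forall E)
\qquad
\frac{\Gamma \vdash N : \sigma}
     {\Gamma \vdash \lambda t.N : \forall t.\sigma}\ (\forall I)^*
\]

where $(\forall I)^*$ denotes the eigenvariable condition.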

The introduction rules, $(\to I)$ and $(\forall I)$, can be coded, respectively, as
domain-free λ-abstractions based on the distinction between the sorted variables. The
elimination rules, $(\to E)$ and $(\forall E)$, can also be represented, respectively, by the
pairs of two expressions, based on sorted variables. Hence, when well-typed terms
of domain-free λ2 are given, the type assignment rules are uniquely determined
by the shape of the terms. From this syntactical property of terms, we have the
natural generation lemma for domain-free λ2.



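Lemma 1 (Generation lemma)

(1) If $\Gamma \vdash x : \sigma$, then $\Gamma(x) = \sigma$.

(2) If $\Gamma \vdash N_1 N_2 : \sigma$, then $\Gamma \vdash N_1 : \sigma_1 \to \sigma$ and
$\Gamma \vdash N_2 : \sigma_1$ for some $\sigma_1$.

(3) If $\Gamma \vdash \lambda x.N : \sigma$, then $\Gamma, x{:}\sigma_1 \vdash N : \sigma_2$ and
$\sigma = \sigma_1 \to \sigma_2$ for some $\sigma_1$ and $\sigma_2$.

(4) If $\Gamma \vdash \lambda t.N : \sigma$, then $\Gamma \vdash N : \sigma_1$ and
$\sigma = \forall t.\sigma_1$ together with $t \notin FV(\Gamma)$ for some $\sigma_1$.

(5) If $\Gamma \vdash N[\sigma_1] : \sigma$, then $\Gamma \vdash N : \forall t.\sigma_2$ and
$\sigma = \sigma_2[t := \sigma_1]$ for some $\sigma_2$.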
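The claim that the shape of a domain-free term determines the applicable rule can be illustrated with a minimal sketch. The tagged-tuple encoding of terms and the function names `last_rule` and `rule_skeleton` are assumptions made for this illustration only; they are not part of the paper. The sketch reads off the sequence of rules of a derivation from the term alone; it does not check typability.

```python
# Domain-free λ2 terms as tagged tuples (an encoding assumed for this sketch):
#   ("var", x)  ("lam", x, N)  ("app", N1, N2)  ("tlam", t, N)  ("tapp", N, sigma)
RULE_FOR_SHAPE = {
    "var": "(var)",
    "lam": "(->I)",
    "app": "(->E)",
    "tlam": "(forall I)",
    "tapp": "(forall E)",
}


def last_rule(term):
    """Return the only rule whose conclusion can have `term` as its subject."""
    return RULE_FOR_SHAPE[term[0]]


def rule_skeleton(term):
    """List the rules of a derivation for `term`, premises before conclusion."""
    tag = term[0]
    if tag == "var":
        premises = []
    elif tag in ("lam", "tlam"):
        premises = rule_skeleton(term[2])
    elif tag == "app":
        premises = rule_skeleton(term[1]) + rule_skeleton(term[2])
    else:  # "tapp": the type argument is not a subterm
        premises = rule_skeleton(term[1])
    return premises + [last_rule(term)]


# (λt.λx.x)[s] y
poly_id = ("tlam", "t", ("lam", "x", ("var", "x")))
example = ("app", ("tapp", poly_id, "s"), ("var", "y"))
print(rule_skeleton(example))
```

The skeleton fixes which rules occur, but not the types $\sigma_1$ that the generation lemma leaves existentially quantified; finding those is where the difficulty of the decision problems lies.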



3 Type Checking, Typability, and Type Inference for
Domain-Free λ2

The problem of type inference asks: given a term $M$, are there a context
$\Gamma$ and a type $\sigma$ such that $\Gamma \vdash M : \sigma$ is derivable? The problem of
strong type inference asks: given a term $M$ and a context $\Gamma_0$, are
there a context $\Gamma \supseteq \Gamma_0$ and a type $\sigma$ such that $\Gamma \vdash M : \sigma$
is derivable? The typability problem asks: given a term $M$ and a context $\Gamma$, is
there a type $\sigma$ such that $\Gamma \vdash M : \sigma$ is derivable? Finally, the type
checking problem asks: given a term $M$, a type $\sigma$, and a context $\Gamma$, is the
judgement $\Gamma \vdash M : \sigma$ derivable? The problems of (strong) type inference,
typability, and type checking are denoted, respectively, by $? \vdash M : ?$,
$\Gamma \vdash M : ?$, and $\Gamma \vdash M : \sigma?$.

The problems of type checking, typability, and type inference for Curry and
Church λ2 are investigated by Jutting [21], Wells, and Schubert [28, 29], as
shown in the following table:

Decidability for type checking, typability, and type inference of λ2

λ2             $\Gamma \vdash M : \sigma?$   $\Gamma \vdash M : ?$   $? \vdash M : ?$
Church-style   yes [21]                      yes [21]                no [28]
Domain-Free    ?1                            ?2                      ?3
Curry-style    no [32]                       no [32]                 no [32]


In this paper, we will prove that in the table above, all of ?1, ?2, and ?3 (case of
strong type inference) are "no", i.e., undecidable.¹

By the use of a type forgetful map, the three styles of judgements are
equivalent in the following sense, where $|\cdot|$ is a domain erasing map
($|\lambda x{:}\sigma.M| = \lambda x.|M|$), and $\|\cdot\|$ is a type erasing map
($\|M[\sigma]\| = \|M\|$, $\|\lambda t.M\| = \|M\|$):

(1) If $\Gamma \vdash M : \sigma$ in Church style, then $\Gamma \vdash |M| : \sigma$ in domain-free style.

(2) If $\Gamma \vdash M : \sigma$ in domain-free style, then $\Gamma \vdash \|M\| : \sigma$ in Curry style.

The inverse directions say that there exists a term whose erasure is the same
as the original term.

(-1) If $\Gamma \vdash M : \sigma$ in domain-free style, then there exists $M_1$ such that
$\Gamma \vdash M_1 : \sigma$ in Church style and $|M_1| = M$.

(-2) If $\Gamma \vdash M : \sigma$ in Curry style, then there exists $M_2$ such that
$\Gamma \vdash M_2 : \sigma$ in domain-free style and $\|M_2\| = M$.
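The two forgetful maps are simple structural recursions, which the following minimal sketch makes explicit. The tagged-tuple encoding and the names `erase_domains` and `erase_types` are assumptions of this illustration, not notation from the paper; types are treated as opaque values.

```python
# Church-style terms as tagged tuples (encoding assumed for this sketch):
#   ("var", x)  ("lam", x, sigma, M)  ("app", M1, M2)  ("tlam", t, M)  ("tapp", M, sigma)
# Domain-free terms use ("lam", x, N) without a domain; Curry-style terms
# use only "var", domain-free "lam", and "app".

def erase_domains(m):
    """|M|: Church-style -> domain-free, dropping the domain of each λx:σ."""
    tag = m[0]
    if tag == "var":
        return m
    if tag == "lam":
        return ("lam", m[1], erase_domains(m[3]))
    if tag == "app":
        return ("app", erase_domains(m[1]), erase_domains(m[2]))
    if tag == "tlam":
        return ("tlam", m[1], erase_domains(m[2]))
    if tag == "tapp":
        return ("tapp", erase_domains(m[1]), m[2])
    raise ValueError(f"unknown term tag: {tag!r}")


def erase_types(n):
    """||N||: domain-free -> Curry-style, dropping λt and [σ]."""
    tag = n[0]
    if tag == "var":
        return n
    if tag == "lam":
        return ("lam", n[1], erase_types(n[2]))
    if tag == "app":
        return ("app", erase_types(n[1]), erase_types(n[2]))
    if tag in ("tlam", "tapp"):
        # ("tlam", t, N) keeps N at index 2; ("tapp", N, sigma) keeps N at index 1
        return erase_types(n[2] if tag == "tlam" else n[1])
    raise ValueError(f"unknown term tag: {tag!r}")


# Two different Church-style terms with the same domain erasure:
id_s = ("lam", "x", "s", ("var", "x"))
id_t = ("lam", "x", "t", ("var", "x"))
print(erase_domains(id_s) == erase_domains(id_t))
```

The final comparison shows in a small case why the maps are not one-to-one: distinct annotated terms collapse to the same erasure.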

For the problems above, however, the inverse directions are not straightforward,
since the forgetful maps are not one-to-one. We have to check that the
given term is the same as an erasure of some term, or that the given term has
the same type as that of the erasure, see also Section 7. Here, we will directly
study the problems for domain-free λ2.

On the basis of the generation lemma (Lemma 1), we first observe that the
(strong) type inference problem for domain-free λ2 is reduced to the typability
problem, and the typability problem for domain-free λ2 is reduced to the type
checking problem.

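Lemma 2 $\exists \Gamma \exists \sigma.\ \Gamma, \Gamma_0 \vdash M : \sigma$ in DF λ2
$\iff$ $\exists \sigma.\ \Gamma_0 \vdash \lambda \vec{x}.M : \sigma$ in DF λ2
$\iff$ $\Gamma_0 \vdash (\lambda x.\lambda y.y)(\lambda \vec{x}.M) : t \to t$ in DF λ2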

4 Terms with Partial Type Annotations 

In order to study static properties of λ-terms in a uniform framework, we
introduce partially typed terms (preterms) and type assignment rules for preterms.

Partially typed terms (preterms), denoted by $P, Q$, are:

\[
P ::= x \mid \lambda x{:}\sigma.P \mid PP \mid \lambda t.P \mid P[\sigma]^a \mid \lambda x.P \mid P[\,]^a
\]

^ After completing HIES, a correspondence with Gilles Barthe informed the author 
that all of ? are independently proved undecidable in [S|. For the detail, see also the 
footnote in Section 7. 

where the mark $[\,]^a$ must be left to indicate that a type has been erased. Moreover,
the label $a$ in $[\,]^a$ will be used to identify the occurrences of [ ], i.e., the
type-holes [ ] with the same label should be obtained by erasing the same type.
This annotation is a natural constraint on the traditional definition of preterms
and gives useful information. The use of our type-holes $[\,]^a$ plays the
same role as existential quantification over types, see also Proposition 4. We
often omit the annotation of a type-hole unless it is necessary to identify the
occurrences.

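We say that a preterm $P$ is a partial erasure of $M$ in Church-style, denoted
by $P \lhd M; \Delta$, if it is derived by the following rules, where $\Delta$ is a mapping from
an annotation to a type:

\[
x \lhd x; \Delta \ \ (\text{var})
\]

\[
\frac{P \lhd M; \Delta}{\lambda x{:}\sigma.P \lhd \lambda x{:}\sigma.M; \Delta}\ (\text{abst-var})
\qquad
\frac{P \lhd M; \Delta}{\lambda x.P \lhd \lambda x{:}\sigma.M; \Delta}\ (\text{abst-df})
\]

\[
\frac{P_1 \lhd M_1; \Delta \qquad P_2 \lhd M_2; \Delta}{P_1 P_2 \lhd M_1 M_2; \Delta}\ (\text{app})
\qquad
\frac{P \lhd M; \Delta}{\lambda t.P \lhd \lambda t.M; \Delta}\ (\text{abst-type})
\]

\[
\frac{P \lhd M; \Delta}{P[\sigma]^a \lhd M[\sigma]^a; \Delta, \sigma^a}\ (\text{app-type})^{\#}
\qquad
\frac{P \lhd M; \Delta}{P[\,]^a \lhd M[\sigma]^a; \Delta, \sigma^a}\ (\text{app-hole})^{\#}
\]

Here, $\#$ denotes the condition that if $\Delta(a) \neq \emptyset$ then $\Delta(a) = \sigma$.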

We now consider special cases of preterms, which give fine structures between
Church-style and Curry-style, as follows:

(1) Domain-free terms:
Preterms which are derived by the use of all the rules but (abst-var) and (app-hole).

(2) [ ]-application terms:
Preterms which are derived by all the rules but (abst-df) and (app-type).

(3) DF&[ ] terms:
Preterms which are derived by all the rules but (abst-var) and (app-type).

A term in Curry-style is here regarded as a full erasure that is obtained from a
term of DF&[ ] by deleting both $\lambda t$ and [ ].
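The three classes are determined by which constructors a preterm uses, so membership can be decided by a single traversal. The sketch below is only an illustration under an assumed tagged-tuple encoding; the tag names (`alam` for annotated abstraction, `dlam` for domain-free abstraction, `tapp` for explicit type application, `hole` for type-hole application) and the function names are hypothetical.

```python
# Preterms as tagged tuples (encoding assumed for this sketch):
#   ("var", x)             ("app", P, Q)          ("tlam", t, P)
#   ("alam", x, sigma, P)  annotated abstraction   λx:σ.P   (abst-var)
#   ("dlam", x, P)         domain-free abstraction λx.P     (abst-df)
#   ("tapp", P, sigma, a)  type application        P[σ]^a   (app-type)
#   ("hole", P, a)         type-hole application   P[ ]^a   (app-hole)

def tags(p):
    """Collect the constructor tags used anywhere in preterm p."""
    tag = p[0]
    if tag == "var":
        return {tag}
    if tag == "alam":
        return {tag} | tags(p[3])
    if tag in ("dlam", "tlam"):
        return {tag} | tags(p[2])
    if tag == "app":
        return {tag} | tags(p[1]) | tags(p[2])
    if tag in ("tapp", "hole"):
        return {tag} | tags(p[1])
    raise ValueError(f"unknown preterm tag: {tag!r}")


# Each class is "all rules but" two excluded constructors.
EXCLUDED = {
    "domain-free": {"alam", "hole"},
    "[ ]-application": {"dlam", "tapp"},
    "DF&[ ]": {"alam", "tapp"},
}


def styles(p):
    """Return the names of the classes that preterm p belongs to."""
    used = tags(p)
    return {name for name, banned in EXCLUDED.items() if not (used & banned)}


# (λx.x)[ ]^a uses a domain-free abstraction and a type-hole.
print(styles(("hole", ("dlam", "x", ("var", "x")), "a")))
```

A preterm built only from variables and term application belongs to all three classes, which reflects that the classes overlap on their common core.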

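We also say that a preterm $P_1$ is a partial erasure of a preterm $P_2$, denoted
by $P_1 \unlhd P_2$, if it is derived by the following rules:

\[
x \unlhd x
\qquad
\frac{P_1 \unlhd P_2}{\lambda x{:}\sigma.P_1 \unlhd \lambda x{:}\sigma.P_2}
\qquad
\frac{P_1 \unlhd P_2}{\lambda x.P_1 \unlhd \lambda x{:}\sigma.P_2}
\qquad
\frac{P_1 \unlhd P_2}{\lambda x.P_1 \unlhd \lambda x.P_2}
\]

\[
\frac{P_1 \unlhd Q_1 \qquad P_2 \unlhd Q_2}{P_1 P_2 \unlhd Q_1 Q_2}
\qquad
\frac{P_1 \unlhd P_2}{\lambda t.P_1 \unlhd \lambda t.P_2}
\]

\[
\frac{P_1 \unlhd P_2}{P_1[\sigma] \unlhd P_2[\sigma]}
\qquad
\frac{P_1 \unlhd P_2}{P_1[\,] \unlhd P_2[\sigma]}
\qquad
\frac{P_1 \unlhd P_2}{P_1[\,] \unlhd P_2[\,]}
\]

It is clear that $\unlhd$ constitutes a partial order on preterms.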
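
We next define type assignment rules for preterms:

\[
\Gamma \vdash_p x : \Gamma(x); \Delta
\qquad
\frac{\Gamma \vdash_p P_1 : \sigma_1 \to \sigma_2; \Delta \qquad \Gamma \vdash_p P_2 : \sigma_1; \Delta}
     {\Gamma \vdash_p P_1 P_2 : \sigma_2; \Delta}\ (\to E)
\]

\[
\frac{\Gamma, x{:}\sigma_1 \vdash_p P : \sigma_2; \Delta}
     {\Gamma \vdash_p \lambda x{:}\sigma_1.P : \sigma_1 \to \sigma_2; \Delta}\ (\to I_1)
\qquad
\frac{\Gamma, x{:}\sigma_1 \vdash_p P : \sigma_2; \Delta}
     {\Gamma \vdash_p \lambda x.P : \sigma_1 \to \sigma_2; \Delta}\ (\to I_2)
\]

\[
\frac{\Gamma \vdash_p P : \forall t.\sigma_1; \Delta}
     {\Gamma \vdash_p P[\sigma_2]^a : \sigma_1[t := \sigma_2]; \Delta}\ (\forall E_1)^{\#}
\qquad
\frac{\Gamma \vdash_p P : \forall t.\sigma_1; \Delta}
     {\Gamma \vdash_p P[\,]^a : \sigma_1[t := \sigma_2]; \Delta}\ (\forall E_2)^{\#}
\]

\[
\frac{\Gamma \vdash_p P : \sigma; \Delta}
     {\Gamma \vdash_p \lambda t.P : \forall t.\sigma; \Delta}\ (\forall I)^*
\]

where $(\forall I)^*$ denotes the eigenvariable condition, and $\#$ denotes the condition
$\Delta(a) = \sigma_2$.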

We may write $\Gamma \vdash_p P : \sigma$ for $\Gamma \vdash_p P : \sigma; \Delta$ and [ ] for $[\,]^a$,
unless it is necessary to use annotations for type-holes. Since a well-typed preterm
can be uniquely decomposed by one of the assignment rules, we have the generation
lemma below:



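Lemma 3 (Generation lemma)

(1) If $\Gamma \vdash_p x : \sigma; \Delta$, then $\Gamma(x) = \sigma$.

(2) If $\Gamma \vdash_p P_1 P_2 : \sigma; \Delta$, then $\Gamma \vdash_p P_1 : \sigma_1 \to \sigma; \Delta$
and $\Gamma \vdash_p P_2 : \sigma_1; \Delta$ for some $\sigma_1$.

(3) If $\Gamma \vdash_p \lambda x{:}\sigma_1.P : \sigma; \Delta$, then $\Gamma, x{:}\sigma_1 \vdash_p P : \sigma_2; \Delta$
and $\sigma = \sigma_1 \to \sigma_2$ for some $\sigma_2$.

(4) If $\Gamma \vdash_p \lambda x.P : \sigma; \Delta$, then $\Gamma, x{:}\sigma_1 \vdash_p P : \sigma_2; \Delta$
and $\sigma = \sigma_1 \to \sigma_2$ for some $\sigma_1$ and $\sigma_2$.

(5) If $\Gamma \vdash_p \lambda t.P : \sigma; \Delta$, then $\Gamma \vdash_p P : \sigma_1; \Delta$ together
with $\sigma = \forall t.\sigma_1$ and $t \notin FV(\Gamma)$ for some $\sigma_1$.

(6) If $\Gamma \vdash_p P[\sigma_1]^a : \sigma; \Delta$, then $\Gamma \vdash_p P : \forall t.\sigma_2; \Delta$
together with $\sigma = \sigma_2[t := \sigma_1]$ and $\Delta(a) = \sigma_1$ for some $\sigma_2$.

(7) If $\Gamma \vdash_p P[\,]^a : \sigma; \Delta$, then $\Gamma \vdash_p P : \forall t.\sigma_2; \Delta$
together with $\sigma = \sigma_2[t := \sigma_1]$ and $\Delta(a) = \sigma_1$ for some $\sigma_2$.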

The generation lemma above allows us to deduce the following relationship
between Church-style judgements and judgements of preterms.

Lemma 4 (1) If we have $\Gamma \vdash_p P_1 : \sigma$, then $\Gamma \vdash_p P_2 : \sigma$ for any $P_2 \unlhd P_1$.
(2) If we have $\Gamma \vdash_p P : \sigma; \Delta$, then $\Gamma \vdash M : \sigma$ in Church style for some
$M$ with $P \lhd M; \Delta$.

The problem of partial type reconstruction [25, 26] asks: given a context
$\Gamma$ and a preterm $P$, is there a Church-style term $M$ such that $\Gamma \vdash M : \sigma$ holds
in Church-style for some type $\sigma$ and that $P \lhd M; \Delta$ for some $\Delta$? From
Lemma 4, the partial type reconstruction problem is equivalent to the following
typability problem:

Given a context $\Gamma$ and a preterm $P$, is there a type $\sigma$ such that
$\Gamma \vdash_p P : \sigma; \Delta$ is derivable for some $\Delta$?

We say that a preterm $P$ is a normal form if $P$ contains no subterm of
any of the forms $(\lambda x{:}\sigma.P_1)P_2$, $(\lambda x.P_1)P_2$, $(\lambda t.P_1)[\sigma]$, or $(\lambda t.P_1)[\,]$.

5 Type Inference is Undecidable for Domain-Free λ2

In this section, we prove that the problem of strong type inference for
domain-free λ2 is undecidable. To show this, we demonstrate a stronger result:
the problem of strong type inference is undecidable for the predicative fragment
of domain-free λ2, called domain-free ML. Strictly speaking, the following
system is a subsystem of the so-called ML; however, such a subsystem is enough to
establish the undecidability.

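Domain-free (DF) ML:

Monotypes $\tau ::= t \mid \tau \to \tau$    Polytypes $\sigma ::= \tau \mid \forall t.\sigma$

Contexts $\Gamma ::= \langle\rangle \mid x{:}\sigma, \Gamma$

Terms $M ::= x \mid \lambda x.M \mid MM \mid x[\tau_1] \cdots [\tau_n]$

Type Assignment Rules

\[
\frac{\Gamma(x) = \forall t_1 \cdots t_n.\tau}
     {\Gamma \vdash x[\tau_1] \cdots [\tau_n] : \tau[t_1 := \tau_1, \ldots, t_n := \tau_n]}\ (n \ge 0)
\]

\[
\frac{\Gamma, x{:}\tau_1 \vdash M : \tau_2}{\Gamma \vdash \lambda x.M : \tau_1 \to \tau_2}
\qquad
\frac{\Gamma \vdash M_1 : \tau_1 \to \tau_2 \qquad \Gamma \vdash M_2 : \tau_1}{\Gamma \vdash M_1 M_2 : \tau_2}
\]

Note that [ ]-application ML can also be defined similarly.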

We first introduce a restricted second-order unification problem for
well-formed second-order expressions, which are defined from monotypes $\tau$, the binary
function constant $\to$, and $n$-arity second-order function variables $X$ ($n \ge 0$) whose
arguments contain no variables. Such terms for the second-order unification are
denoted by $T$ or $U$. A well-formed expression is defined as follows:

(1) A type variable $t$ is a well-formed expression.

(2) If $X$ is an $n$-arity variable ($n \ge 0$) and $\tau_i$ ($1 \le i \le n$) are monotypes, then
$X\tau_1 \cdots \tau_n$ is well-formed.

(3) If $T_1$ and $T_2$ are well-formed, then so is $T_1 \to T_2$.

Given a well-formed expression $T$, a set of (unification) variables in $T$,
denoted by $uVar(T)$, and a set of constants in $T$, denoted by $Con(T)$, are defined,
respectively, as follows:

\[
uVar(t) = \emptyset; \quad uVar(X\tau_1 \cdots \tau_n) = \{X\}\ (n \ge 0); \quad
uVar(T_1 \to T_2) = uVar(T_1) \cup uVar(T_2).
\]

\[
Con(t) = \{t\}; \quad Con(X\tau_1 \cdots \tau_n) = \emptyset\ (n \ge 0); \quad
Con(T_1 \to T_2) = Con(T_1) \cup Con(T_2).
\]
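The two sets can be computed by a direct recursion over the three forms of well-formed expressions. The following is a minimal sketch; the tagged-tuple encoding and the names `u_var` and `con` are assumptions of this illustration.

```python
# Well-formed expressions as tagged tuples (encoding assumed for this sketch):
#   ("tvar", t)                a type variable, acting as a constant
#   ("sovar", X, (tau1, ...))  a second-order variable applied to monotypes
#   ("arrow", T1, T2)          T1 -> T2

def u_var(T):
    """uVar(T): the second-order (unification) variables occurring in T."""
    tag = T[0]
    if tag == "tvar":
        return set()
    if tag == "sovar":
        return {T[1]}
    return u_var(T[1]) | u_var(T[2])


def con(T):
    """Con(T): the type variables of T occurring outside arguments of X."""
    tag = T[0]
    if tag == "tvar":
        return {T[1]}
    if tag == "sovar":
        # By the definition, arguments of a second-order variable contribute nothing.
        return set()
    return con(T[1]) | con(T[2])


a, b, c = ("tvar", "a"), ("tvar", "b"), ("tvar", "c")
# X a b -> (Y -> c), where Y has arity 0
T = ("arrow", ("sovar", "X", (a, b)), ("arrow", ("sovar", "Y", ()), c))
print(sorted(u_var(T)), sorted(con(T)))
```

In the example, `a` and `b` occur only as arguments of `X`, so they do not appear in `con(T)`, in line with the clause $Con(X\tau_1 \cdots \tau_n) = \emptyset$.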

Let well-formed $T_1$ and $T_2$ be given, and let $uVar(T_1, T_2)$ be
$uVar(T_1) \cup uVar(T_2) = \{X_1, \ldots, X_n\}$. The unification problem ($T_1 \doteq T_2$) is a
problem to find well-formed $U_i$ for each $X_i$ ($1 \le i \le n$), such that

(1) Let $X_i$ be a $k(i)$-arity variable, and $S$ be a substitution such that
$[X_1 := \lambda t_1 \cdots t_{k(1)}.U_1, \ldots, X_n := \lambda t_1 \cdots t_{k(n)}.U_n]$. Then
$S(T_1) =_\beta S(T_2)$ holds.

(2) We have $uVar(U_i) = \emptyset$ for $1 \le i \le n$.

If we have a substitution $S$ such that the above (1) and (2) are satisfied, then
we say that $T_1$ and $T_2$ are unifiable, and that the unification problem has an
answer $S$. In this case, from the definition, there exists a monotype $\tau$ such that
$S(T_1) =_\beta \tau =_\beta S(T_2)$.

Theorem 1 (Schubert [28]). The second-order unification problem on the
well-formed expressions is undecidable.

Schubert [28] has proved that the halting problem for two-counter automata is
reduced to the unification problem, where a two-counter automaton can simulate
an arbitrary Turing machine.

In order to give a reduction from the unification problem to the problem of
type inference for domain-free ML, we first define a (pre)context $\Sigma$. The context
itself may not be an ML-context, but it becomes an ML-context under some
substitution if unifiable. This can be justified, since the reduction is formalized
as follows: the unification problem $T_1 \doteq T_2$ has an answer if and only if there
exist $\Gamma$ and $\tau$ such that $\Gamma, \Gamma_0 \vdash M : \tau$ in domain-free ML. Here, $\Gamma_0$ and $M$ are
given by $T_1$ and $T_2$, where $\Gamma_0$ consists only of monotypes in $Con(T_1, T_2)$. Only if
$T_1$ and $T_2$ are unifiable, say by the unifier $S$, can the ML-context $\Gamma$ be obtained
as a subcontext of $S(\Sigma)$, such that $S(\Sigma) = \Gamma, \Gamma_0$. Moreover, the monotype $\tau$
can also be obtained as a substitution instance (of $ty(\cdot)$ defined below) by $S$.
Given a well-formed $T$, we construct the context $\Sigma[T]$, such that

(1) For each $t \in Con(T)$, $t$ is inhabited in $\Sigma[T]$, i.e., $\Sigma[T](x) = t$ for some $x$.

(2) For each $n$-arity variable $X \in uVar(T)$ where $n \ge 0$, the universal closure
$\forall t_1 \cdots t_n.(X t_1 \cdots t_n)$ is inhabited in $\Sigma[T]$.

$\Sigma[T_1, T_2]$ is also defined similarly, and we simply write $\Sigma$ for $\Sigma[T_1, T_2]$.

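Let $T_1$ and $T_2$ be well-formed. Given a second-order unification problem
$T_1 \doteq T_2$, then, following Pfenning, we construct a term of domain-free ML
by the use of the following $\mathcal{UT}$ and $\mathcal{TT}$:

\[
\mathcal{UT}(\Sigma[T_1, T_2]; T_1 \doteq T_2) =
\lambda z_1.\lambda z_2.\lambda f.\, f z_1 (f z_2 (\lambda g.\, g\, (\mathcal{TT}(\Sigma; z_1; T_1))\, (\mathcal{TT}(\Sigma; z_2; T_2)))),
\]

where $\mathcal{TT}(\Sigma; z; T)$ is defined by induction on $T$:

(1) $\mathcal{TT}(\Sigma; z; t) = \lambda f.\, f z (f x (\lambda g.g))$
where $\Sigma(x) = t \in Con(T_1, T_2)$

(2) $\mathcal{TT}(\Sigma; z; X\tau_1 \cdots \tau_n) = \lambda f.\, f z (f (x[\tau_1] \cdots [\tau_n]) (\lambda g.g))$
where $\Sigma(x) = \forall t_1 \cdots t_n.(X t_1 \cdots t_n)$

(3) $\mathcal{TT}(\Sigma; z; T_1 \to T_2) =
\lambda z_1.\lambda z_2.\lambda f.\, f (z z_1) (f z_2 (\lambda g.\, g\, (\mathcal{TT}(\Sigma; z_1; T_1))\, (\mathcal{TT}(\Sigma; z_2; T_2))))$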

Remarks 1 The reduction via $\mathcal{UT}$ and $\mathcal{TT}$ gives a β-normal term.

The translation $\mathcal{TT}(\Sigma; z; T)$ says that the type of $z$ would be a substitution
instance of $T$, see Lemma 5 below.

We next construct $ty(T)$, which is a type of $\mathcal{TT}(\Sigma; z; T)$. Although $ty(T)$ itself
may not be a monotype, it becomes a monotype under some substitution if
unifiable. Here of course we assume that we have countably many type variables
to use a fresh type variable $t$ for each application of the following definition:

(0) $ty(\tau) = (\tau \to (t \to t) \to t \to t) \to t \to t$ for $\tau \in Con(T)$;

(1) $ty(X\tau_1 \cdots \tau_n) = ((X\tau_1 \cdots \tau_n) \to (t \to t) \to t \to t) \to t \to t$ where $n \ge 0$;

(2) $ty(T_1 \to T_2) = T_1 \to T_2 \to (T_2 \to A \to A) \to A$
where $A = (ty(T_1) \to ty(T_2) \to t) \to t$.

Lemma 5 $S(\Sigma[T]), x{:}\tau \vdash \mathcal{TT}(\Sigma[T]; x; T) : S(ty(T))$ in domain-free ML if and
only if $S(T) =_\beta \tau$.

Proof. By induction on $T$. We show the following two cases:

(1) Case of $T = X\tau_1 \cdots \tau_n$

Let $S(\Sigma(y))$ be $\forall t_1 \cdots t_n.((SX) t_1 \cdots t_n)$. We have

\[
S(\Sigma), x{:}\tau \vdash \lambda f.\, f x (f (y[\tau_1] \cdots [\tau_n]) (\lambda g.g)) :
(((SX)\tau_1 \cdots \tau_n) \to (t \to t) \to t \to t) \to t \to t
\]

in domain-free ML. Here, the types of $x$ and $y[\tau_1] \cdots [\tau_n]$ must be equal. That is,
$\tau =_\beta ((SX)\tau_1 \cdots \tau_n)$.

(2) Case of $T = T_1 \to T_2$

From the definition, in domain-free ML we have

\[
S(\Sigma), x{:}\tau \vdash \lambda z_1.\lambda z_2.\lambda f.\, f (x z_1) (f z_2 (\lambda g.\, g\, (\mathcal{TT}(\Sigma; z_1; T_1))\, (\mathcal{TT}(\Sigma; z_2; T_2)))) :
S(T_1) \to S(T_2) \to (S(T_2) \to A \to A) \to A
\]

where $A = (S(ty(T_1)) \to S(ty(T_2)) \to t) \to t$.

Then, we also have

\[
S(\Sigma), x{:}\tau, z_1{:}S(T_1), z_2{:}S(T_2) \vdash
\lambda f.\, f (x z_1) (f z_2 (\lambda g.\, g\, (\mathcal{TT}(\Sigma; z_1; T_1))\, (\mathcal{TT}(\Sigma; z_2; T_2)))) : (S(T_2) \to A \to A) \to A.
\]

Here, from the induction hypotheses, we have the following:

$S(\Sigma), z_1{:}\tau_3 \vdash \mathcal{TT}(\Sigma; z_1; T_1) : S(ty(T_1))$ iff $\tau_3 =_\beta S(T_1)$
$S(\Sigma), z_2{:}\tau_4 \vdash \mathcal{TT}(\Sigma; z_2; T_2) : S(ty(T_2))$ iff $\tau_4 =_\beta S(T_2)$

Now, the types of $(x z_1)$ and $z_2$ must be equal, i.e., $\tau =_\beta S(T_1) \to S(T_2)$. □

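Lemma 6 (main lemma) $S(T_1) =_\beta S(T_2)$ if and
only if $S(\Sigma) \vdash \mathcal{UT}(\Sigma; T_1 \doteq T_2) : S(ty(T_1 \to T_2))$ in domain-free ML.

Proof. $S(\Sigma) \vdash \mathcal{UT}(\Sigma; T_1 \doteq T_2) : S(ty(T_1 \to T_2))$
iff $S(\Sigma), z_1{:}S(T_1), z_2{:}S(T_2) \vdash \lambda f.\, f z_1 (f z_2 (\lambda g.\, g\, (\mathcal{TT}(\Sigma; z_1; T_1))\, (\mathcal{TT}(\Sigma; z_2; T_2))))
: (S(T_2) \to A \to A) \to A$ where $A = (S(ty(T_1)) \to S(ty(T_2)) \to t) \to t$
iff (Lemma 5) $S(T_1) =_\beta S(T_2)$. □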

Proposition 1 The unification problem on the well-formed expressions is
reduced to the problem of strong type inference for domain-free ML. In other words,
$S(T_1) =_\beta S(T_2)$ $\iff$ $\exists \Gamma.\exists \tau.\ \Gamma, \Gamma_0 \vdash M : \tau$ in domain-free ML,
where $\Gamma_0$ and $M$ are determined by $T_1$ and $T_2$.

Proof. ($\Rightarrow$): From Lemma 6, $\Gamma_0$ and $M$ are determined by $T_1$ and $T_2$,
such that for each $t \in Con(T_1, T_2)$, we have $\Gamma_0(x) = t$ for some $x$, and that
$M = \mathcal{UT}(\Sigma; T_1 \doteq T_2)$. Then the unifier $S$ gives $\Gamma$ and $\tau$, respectively, such that
$S(\Sigma[T_1, T_2]) = \Gamma, \Gamma_0$ and $S(ty(T_1 \to T_2)) = \tau$.

($\Leftarrow$): Given $\Gamma_0$ and $M$, assume that there exist $\tau$ and $\Gamma$ such that
$\Gamma = \{x_1 : \tau_1, \ldots, x_m : \forall t_1 \cdots t_n.\tau_m\}$. For each $X_i \in uVar(T_1, T_2)$, assume that
$\Sigma(x_1) = X_1, \ldots, \Sigma(x_m) = \forall t_1 \cdots t_n.(X_m t_1 \cdots t_n)$. Then an answer to the trivial
second-order matching problem such that $X_1 = \tau_1, \ldots, (X_m t_1 \cdots t_n) = \tau_m$ finds
a matching $S$ for $ty(T_1 \to T_2) = \tau$, since if $S(\Sigma[T]), x{:}\tau_0 \vdash \mathcal{TT}(\Sigma[T]; x; T) : \tau$
for some $\tau$, then $S(ty(T)) =_\beta \tau$. From Lemma 6, the unifier $S$ is an answer to
the unification problem $T_1 \doteq T_2$. □

Proposition 2 (no(nf, str)) The problem of strong type inference is
undecidable for domain-free ML, even when the given term is a normal form.

Proof. From Theorem 1, Proposition 1, and Remark 1. □

Remarks 2 (no(nf, str)) The proof of undecidability for strong type
inference is also applicable to that for DF&[ ] terms in the ML-fragment.

Theorem 2 (no, no(nf)). Type checking, typability, and strong type
inference are undecidable for domain-free λ2.

Proof. From Proposition 2 and Lemma 2. Moreover, even in the case where the
given term is a normal form, typability and strong type inference for domain-free
λ2 are still undecidable. □

The problem of typability becomes undecidable for some predicative
extension of the domain-free ML. We introduce a predicative fragment of domain-free
λ2, called domain-free ML$_2$. This extension allows us to abstract a term
variable with a polymorphic type $\sigma$ (polymorphic abstraction), but not to apply a
polymorphic function to a polymorphic type (i.e., only to a monomorphic type
$\tau$). For this purpose, an extension of type schemes is introduced as follows:

\[
\rho ::= \tau \mid \sigma \to \rho
\]

This type $\rho$ belongs to the so-called S(2)-class, which is a special form of
restricted types of rank 2. A set of types of rank $k$, denoted
by $R(k)$, is defined as follows:

\[
R(0) = \text{set of monotypes}; \qquad
R(k+1) = R(k) \mid R(k) \to R(k+1) \mid \forall t.R(k+1).
\]
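The definition of $R(k)$ can be read as an algorithm that computes the least $k$ with a given type in $R(k)$. The sketch below is an illustration under an assumed tagged-tuple encoding of types; the function names `is_monotype` and `rank` are hypothetical.

```python
# Types as tagged tuples (encoding assumed for this sketch):
#   ("tvar", t)   ("arrow", s1, s2)   ("forall", t, s)

def is_monotype(s):
    """A monotype contains no quantifier."""
    tag = s[0]
    if tag == "tvar":
        return True
    if tag == "arrow":
        return is_monotype(s[1]) and is_monotype(s[2])
    return False


def rank(s):
    """Least k such that s is in R(k), following the three clauses of R(k+1)."""
    if is_monotype(s):
        return 0                       # R(0)
    if s[0] == "forall":
        return max(1, rank(s[2]))      # forall t. R(k+1)
    # An arrow with a quantifier somewhere: R(k) -> R(k+1)
    return max(rank(s[1]) + 1, rank(s[2]))


t, s = ("tvar", "t"), ("tvar", "s")
poly = ("forall", "t", ("arrow", t, t))           # forall t. t -> t
print(rank(poly))                                  # rank 1
print(rank(("arrow", poly, ("arrow", s, s))))      # (forall t. t -> t) -> s -> s
```

The second example has the shape $\sigma \to \rho$ with $\sigma$ polymorphic and $\rho$ a monotype, and its computed rank is 2, matching the statement that types $\rho$ are of rank 2.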

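Domain-free (DF) ML$_2$-fragment:

\[
\frac{\Gamma(x) = \forall t_1 \cdots t_n.\tau}
     {\Gamma \vdash x[\tau_1] \cdots [\tau_n] : \tau[t_1 := \tau_1, \ldots, t_n := \tau_n]}\ (n \ge 0)
\]

\[
\frac{\Gamma \vdash M_1 : \tau_1 \to \tau_2 \qquad \Gamma \vdash M_2 : \tau_1}{\Gamma \vdash M_1 M_2 : \tau_2}
\qquad
\frac{\Gamma, x{:}\sigma \vdash M : \rho}{\Gamma \vdash \lambda x.M : \sigma \to \rho}
\]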

The problem of strong type inference in domain-free ML can be reduced to the
typability problem in domain-free ML$_2$. That is, let $\{x_1, \ldots, x_n\}$ be a set of free
variables in $M$; $\exists \sigma_1 \cdots \sigma_n.\exists \tau.\ \Gamma_0, x_1{:}\sigma_1, \ldots, x_n{:}\sigma_n \vdash M : \tau$ in the DF ML-fragment
if and only if $\exists \sigma_1 \cdots \sigma_n.\exists \tau.\ \Gamma_0 \vdash \lambda x_1 \cdots x_n.M : \sigma_1 \to \cdots \to \sigma_n \to \tau$ in the DF
ML$_2$-fragment. From Proposition 2, we can obtain that the typability problem is
undecidable for the rank 2 fragment of domain-free λ2. This result means that
the partial type reconstruction problem is still undecidable even for the rank 2
fragment of λ2.

Corollary 1 (no(nf)) The typability problem for DF ML$_2$ is undecidable.

Remarks 3 From Lemma 2, the type checking problem for domain-free
terms becomes undecidable at rank 3.

Remarks 4 (no(nf)) From Remark 2, the typability problem for DF&[ ] terms
in ML$_2$ is undecidable.

6 Static Properties of [ ]-Application

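In this section, we study type inference, typability, and type checking with
respect to [ ]-application terms. That is, the preterms have the following structure:

\[
P ::= x \mid \lambda x{:}\sigma.P \mid PP \mid \lambda t.P \mid P[\,]^a
\]

We first give a translation $\lceil \cdot \rceil$ from a term in Church-style to [ ]-application
terms, which is motivated by the translation of Barthe and Sørensen from Church-style
to domain-free style. Let $id = \lambda t.\lambda x{:}t.x$. Then $id : \forall t.(t \to t)$.

\[
\lceil x \rceil = x; \qquad \lceil \lambda x{:}\sigma.M \rceil = \lambda x{:}\sigma.\lceil M \rceil;
\]
\[
\lceil M_1 M_2 \rceil = \lceil M_1 \rceil \lceil M_2 \rceil; \qquad \lceil \lambda t.M \rceil = \lambda t.\lceil M \rceil;
\]
\[
\lceil M[\sigma] \rceil = (\lambda v{:}\sigma \to \sigma.\ \lceil M \rceil [\,]^a)(\lambda x{:}\sigma.\ id[\,]^a x) \quad \text{where } v \text{ is a fresh variable.}
\]

For any term $M$ in Church-style, $\lceil M \rceil$ is a preterm of [ ]-applications.

Proposition 3 $\Gamma \vdash M : \sigma$ in Church-style λ2 iff $\Gamma \vdash_p \lceil M \rceil : \sigma$.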

The problem of type inference for λ2 in Church-style is reduced to the (strong)
type inference problem for preterms of [ ]-applications. Hence, from the
undecidability of type inference for λ2 in Church-style, the (strong) type inference
problem for preterms of [ ]-applications is undecidable.

Theorem 3 (no). The (strong) type inference problem for [ ]-application
terms is undecidable.

Remarks 5 (no) Following [28], since the type inference problem is
undecidable for the ML-fragment in Church-style, the (strong) type inference problem for
[ ]-application terms becomes undecidable in the ML-fragment.

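By the use of [ ]-applications, we give a translation which moves existentially
quantified types in the context to the right-hand side of $\vdash_p$.

Proposition 4 $\exists \sigma_1 \cdots \sigma_n.\exists \sigma.\ x_1{:}\sigma_1, \ldots, x_n{:}\sigma_n \vdash_p P : \sigma$ if and
only if $\exists \sigma.\ y_1{:}\forall t.t, \ldots, y_n{:}\forall t.t \vdash_p \langle P \rangle : \sigma$,
where

\[
\langle x_i \rangle = y_i[\,]^{a_i}; \qquad \langle x \rangle = x \ \text{ if } x \not\equiv x_i\ (1 \le i \le n);
\]
\[
\langle \lambda x{:}\sigma.P \rangle = \lambda x{:}\sigma.\langle P \rangle; \qquad \langle P_1 P_2 \rangle = \langle P_1 \rangle \langle P_2 \rangle;
\]
\[
\langle \lambda t.P \rangle = \lambda t.\langle P \rangle; \qquad \langle P[\sigma] \rangle = \langle P \rangle [\sigma]; \qquad \langle P[\,] \rangle = \langle P \rangle [\,].
\]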

The variable $x$ declared in $\Gamma$ is translated to $y[\,]^a$, and the new variable $y$
appears only in the form of $y[\,]^a$ in $\langle P \rangle$. The problem of type inference for
[ ]-application terms is reduced to the typability problem for [ ]-application terms.
From Propositions 3 and 4, the typability problem for [ ]-application terms is
essentially equivalent to type inference in Church-style. From Theorem 3, the
typability problem for [ ]-application terms becomes undecidable.
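The translation that replaces each declared variable by a type-hole application of a fresh variable of type $\forall t.t$ is a plain structural recursion. The following sketch illustrates it under assumptions: preterms are tagged tuples, the function name `move_context` and the argument `fresh` (mapping each declared variable to its fresh variable and hole label) are hypothetical, and bound variables that shadow a declared name are left alone, a detail the definition in the text leaves implicit.

```python
# [ ]-application preterms as tagged tuples (encoding assumed for this sketch):
#   ("var", x)  ("alam", x, sigma, P)  ("app", P, Q)  ("tlam", t, P)  ("hole", P, a)

def move_context(p, fresh):
    """<P>: replace each free occurrence of a declared x_i by y_i[ ]^{a_i}.

    `fresh` maps a declared variable name x_i to a pair (y_i, a_i).
    """
    tag = p[0]
    if tag == "var":
        if p[1] in fresh:
            y, label = fresh[p[1]]
            return ("hole", ("var", y), label)
        return p
    if tag == "alam":
        # A binder for the same name shadows the declaration (sketch assumption).
        inner = {x: v for x, v in fresh.items() if x != p[1]}
        return ("alam", p[1], p[2], move_context(p[3], inner))
    if tag == "app":
        return ("app", move_context(p[1], fresh), move_context(p[2], fresh))
    if tag == "tlam":
        return ("tlam", p[1], move_context(p[2], fresh))
    if tag == "hole":
        return ("hole", move_context(p[1], fresh), p[2])
    raise ValueError(f"unknown preterm tag: {tag!r}")


# x1 x2 with both variables declared in the context
term = ("app", ("var", "x1"), ("var", "x2"))
fresh = {"x1": ("y1", "a1"), "x2": ("y2", "a2")}
print(move_context(term, fresh))
```

Because every $y_i$ has type $\forall t.t$, the hole $[\,]^{a_i}$ can be instantiated to whatever type the context would have assigned to $x_i$, which is how the existential quantification over context types is moved into the term.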






K.-E. Fujita and A. Schubert 



Theorem 4 (no). The typability problem for [ ]-application terms is
undecidable.

Remarks 6 From Remark 5, the typability problem for [ ]-application
terms becomes undecidable at rank 2. This rank 2 fragment is impredicative, i.e.,
cannot be the ML-fragment, since type schemes (rank 1 types) can be substituted.

We next give a reduction from typability to type checking by the use of
[ ]-applications.

Proposition 5 $\exists \sigma.\ \Gamma \vdash_p P : \sigma$ if and only if $\Gamma, x{:}\forall t_1.(t_1 \to t_2) \vdash_p x[\,]P : t_2$.

The problem of typability for [ ]-application terms is reduced to the type checking
problem of [ ]-application terms. From Theorem 4, the type checking problem
for [ ]-application terms becomes undecidable.

Theorem 5 (no). The problem of type checking for [ ]-application terms is
undecidable.

Remarks 7 (no) From Remark 6, the type checking problem for [ ]-application
terms becomes undecidable at rank 3.

Remarks 8 (no(nf)) From Remark 4 and Proposition 5, the type checking
problem for DF&[ ] terms is undecidable at rank 3, even when the given preterm
is a normal form.

7 Related Work and Concluding Remarks 

7.1 Related Work and Summary of Results 

Relating to Proposition 2, the problem of strong type inference is also
undecidable for domain-free ML with non-sorted variables, since the given proof with
a slight modification still works for the definition of $\tau$, where the type variable $t$ is
replaced with the variable $x$, and

\[
\tau ::= x \mid \tau \to \tau
\]

by the use of a single syntactic category of variables $x$. This implies that strong
type inference is undecidable for domain-free λ2 with non-sorted variables.

Our result also implies a negative² answer to the question posed by Barthe
and Sørensen to know whether the problem of type checking is decidable for
² Type checking of domain-free λ2 becomes decidable when the given
term is in β-normal form. By the use of a domain-erasing translation from
Church-style to domain-free λ2 inducing β-redexes, Barthe and Sørensen proved that
the type inference problem is undecidable for domain-free λ2 with sorted variables.
Although both works are based on [28], the distinction is that our reduction induces
no redex, see Remark 1, and our method with a slight modification is also applicable
to the strong type inference problem of domain-free λ2 with non-sorted variables.
domain-free λ2 and λω. Moreover, the type checking problem for the domain-free
λμ-calculus also becomes undecidable.

Besides already known ones, we summarize the results obtained here, written in
italic type and marked with a superscript. "no (nf)" means that the problem is
undecidable even if the given term is a normal form, and "(str)" denotes the strong
type inference problem. "yes (nf)" means that the problem becomes decidable
only if the given term is a normal form.

For positive results, it is noted that Odersky and Läufer have proposed
a decidable extension of ML with fully explicit type scheme annotations so that
the system can express polymorphic function arguments. See also Garrigue and
Rémy for a conservative extension of ML with semi-explicit first-class
polymorphism.

Decidability of type inference, typability, and type checking for λ2

λ2                            Church            Domain-Free   [ ]-App.   DF&[ ]        Curry
$? \vdash P : ?$              no [28], no(nf)   no(nf, str)   no         no(nf, str)   no [32]
$\Gamma \vdash P : ?$         yes [21]          no(nf)        no         no(nf)        no [32]
$\Gamma \vdash P : \sigma?$   yes [21]          yes(nf), no   no         no(nf)        no [32]

Decidability of type inference

$? \vdash P : ?$   Church            Domain-Free   [ ]-App.   DF&[ ]        Curry
λ2                 no(nf)            no(nf, str)   no         no(nf, str)   no [32]
rank 2             no(nf)            no(nf, str)   no         no(nf, str)   yes(str)† [17]
ML                 no [28], no(nf)   no(nf, str)   no         no(nf, str)   yes

† yes(str)† and yes† below hold provided that the given context assigns closed type
schemes to object variables.

Decidability of typability

$\Gamma \vdash P : ?$   Church     Domain-Free   [ ]-App.   DF&[ ]   Curry
λ2                      yes [21]   no(nf)        no         no(nf)   no [32]
rank 2                  yes        no(nf)        no         no(nf)   yes†
ML                      yes        yes           yes        yes      yes

The well-known algorithm W is also applicable to the typability for ML in the
domain-free, [ ]-application, and DF&[ ] cases. Note that typable terms at rank 1
coincide with typable terms in the ML-fragment, and that rank 1 types are applied
to type-directed compilation.

Decidability of type checking

$\Gamma \vdash P : \sigma?$   Church     Domain-Free   [ ]-App.   DF&[ ]   Curry
λ2                            yes [21]   yes(nf), no   no         no(nf)   no [32]
rank 3                        yes        yes(nf), no   no         no(nf)   no [32]
rank 2                        yes        yes(nf), ?    ?          ?        yes†
ML                            yes        yes           yes        yes      yes

7.2 Final Observation and Concluding Remarks 

It is worthwhile to mention the striking contrast between yes, yes(str) and
no(nf), no(nf, str), no, no(nf, str). Pfenning has proved that the
partial type reconstruction problem is undecidable for a predicative fragment of
λ2. However, the undecidability has been explicitly discussed neither from the
viewpoint of rank nor from that of type information of domains and holes. Our result
shows that the existence of type information makes type inference problems
undecidable for the ML-fragment, in contrast with the case of Curry-style. From
this, the partial type reconstruction problem becomes undecidable for the rank
2 fragment of λ2, and moreover making polymorphic domains free and the use
of type-holes [ ] are independently responsible for the undecidability. The
distinction between yes and no for the ML-fragment can be observed in the following.

Let a term $M$ in Church-style ML be given, and consider the problem of finding a
context and a type such that $? \vdash M : ?$ holds in Church-style ML. First consider
the typability problem of $\| |M| \|$ in Curry-style ML. Let $\{x_1, \ldots, x_m\}$ be a set
of free variables in $M$. Let $\Gamma$ be $x_1{:}\forall t.t, \ldots, x_m{:}\forall t.t$. If $\| |M| \|$ is typable, then
we have the principal type $\tau$ of $\| |M| \|$, such that $W(\Gamma, \| |M| \|) = (S, \tau)$ and
$\Gamma \vdash \| |M| \| : \tau$ in Curry-style ML. If $\| |M| \|$ is not typable, then $M$ is not typable
in Church-style, see (1) and (2) in Section 3.

Assume that $\| |M| \|$ is typable. Let $x$ be a free variable appearing $k$ times in
$M$ ($k \ge 1$). From the derivation of $\Gamma \vdash \| |M| \| : \tau$ in Curry-style ML, we obtain
principal monotypes for each occurrence of $x$ in $\| |M| \|$, say $x : \tau_1, \ldots, x : \tau_k$.
Without loss of generality, assume that $x$ appears in one of the following forms
in $M$:

(1) $x$ is used polymorphically in $M$, such that

\[
\begin{cases}
x[\tau_{11}][\tau_{12}] \cdots [\tau_{1n}] \\
\qquad \vdots \\
x[\tau_{k1}][\tau_{k2}] \cdots [\tau_{kn}]
\end{cases}
\]

(2) $k$ occurrences of $x$ have no type applications in $M$.

In order to solve the original type inference problem ($? \vdash M : ?$ in Church-style
ML), we have to check that the type of the given term $M$ can be the same as the
inferred type $\tau$ of the erasure, see also (-1) and (-2) in Section 3, and then at least
we have to find an answer to the following second-order unification problem:

Case of (1)

\[
\begin{cases}
F \tau_{11} \tau_{12} \cdots \tau_{1n} \doteq \tau_1 \\
\qquad \vdots \\
F \tau_{k1} \tau_{k2} \cdots \tau_{kn} \doteq \tau_k
\end{cases}
\]

where $F$ is an unknown second-order variable with $n$-arity, in which every
argument of $F$ contains no variables for the unification problem. Type variables
in $\tau_1, \ldots, \tau_k$ are used as unknown first-order variables for the unification. This
unification problem is a variant of the problem in Section 5, i.e., the simple
instances of the second-order unification problem with first-order variables.

The case of (2) gives the first-order unification problem:

\[
\tau_1 = \tau_2 = \cdots = \tau_k
\]
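The first-order problem of case (2) can be solved by standard syntactic unification with an occurs check. The sketch below is an illustration only, not the procedure of the paper: monotypes are encoded as tagged tuples, every type variable is treated as an unknown, and the names `unify`, `unify_all`, and `resolve` are hypothetical.

```python
# Monotypes as tagged tuples (encoding assumed for this sketch):
#   ("tvar", t)   ("arrow", tau1, tau2)

def walk(t, subst):
    """Follow variable bindings until an unbound variable or an arrow is reached."""
    while t[0] == "tvar" and t[1] in subst:
        t = subst[t[1]]
    return t


def occurs(name, t, subst):
    """Occurs check: does variable `name` occur in t under subst?"""
    t = walk(t, subst)
    if t[0] == "tvar":
        return t[1] == name
    return occurs(name, t[1], subst) or occurs(name, t[2], subst)


def unify(a, b, subst):
    """Return an extended substitution unifying a and b, or None on failure."""
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if a[0] == "tvar":
        return None if occurs(a[1], b, subst) else {**subst, a[1]: b}
    if b[0] == "tvar":
        return unify(b, a, subst)
    subst = unify(a[1], b[1], subst)   # both are arrows
    return None if subst is None else unify(a[2], b[2], subst)


def unify_all(types):
    """Solve tau_1 = tau_2 = ... = tau_k by unifying neighbours in turn."""
    subst = {}
    for left, right in zip(types, types[1:]):
        subst = unify(left, right, subst)
        if subst is None:
            return None
    return subst


def resolve(t, subst):
    """Apply subst to t exhaustively."""
    t = walk(t, subst)
    if t[0] == "tvar":
        return t
    return ("arrow", resolve(t[1], subst), resolve(t[2], subst))


a, b, c, d, e = (("tvar", n) for n in "abcde")
solution = unify_all([("arrow", a, b), ("arrow", c, c), ("arrow", ("arrow", d, d), e)])
print(resolve(a, solution))
```

This decidable first-order case contrasts with case (1), where the unknown $F$ is second-order and the corresponding problem is of the kind shown undecidable in Section 5.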

Following the observation above, W plus an answer to the second-order
unification problem could solve the problems of type inference in the cases of Church,
domain-free, [ ]-applications, and DF&[ ], i.e., additional type information gives
more constraints to be solved.

Another observation is about the definition of preterms. Alternatively, one
can define preterms with full annotations, such that

\[
Q ::= x \mid \lambda x{:}[\sigma]^a.Q \mid QQ \mid \lambda t.Q \mid Q[\sigma]^a \mid \lambda x{:}[\,]^a.Q \mid Q[\,]^a
\]

Then $\Gamma \vdash_p Q : \sigma; \Delta$ can be similarly defined, and the same results in 7.1 also
hold for the preterms following a similar pattern.

Abstract. We give a uniform treatment of the logical properties of
alternating weak automata on infinite strings, extending and refining work
of Muller, Saoudi, and Schupp (1984) and Kupferman and Vardi (1997).
Two ideas are essential in the present set-up: There is no acyclicity
requirement on the transition structure of weak alternating automata, and
acceptance is defined only in terms of reachability of states; moreover, the
run trees of the standard framework are replaced by run dags of bounded
width. As applications, one obtains a new normal form for monadic
second order logic, a simple complementation proof for weak alternating
automata, and elegant connections to temporal logic.



1 Introduction 

Finite automata on infinite strings provide a useful framework for the logical
analysis of sequence properties. The connection to logic is based on (at least)
the following four aspects:

— Nondeterministic Büchi automata are expressively equivalent to monadic
second-order logic (MSO-logic) over infinite strings ([Büc62]). This
equivalence involves a normal form of MSO-formulas in EMSO-logic (existential
monadic second-order logic).

— Connected with this fact is the closure of Büchi automata under complement.

— A hierarchy of acceptance conditions for deterministic $\omega$-automata induces
a natural classification of sequence properties (cf. [MP92]), including, for
example, safety properties and recurrence properties.

— Propositional temporal logic PLTL, a standard framework for the
specification of infinite computations, is characterized by counter-free
deterministic Muller automata, defined by a natural restriction on the loop
structure of transition graphs.

J. van Leeuwen et al. (Eds.): IFIP TCS 2000, LNCS 1872, pp. 521 ff., 2000.
© Springer-Verlag Berlin Heidelberg 2000

In the present paper, we study these logical aspects of $\omega$-languages in the
framework of alternating weak automata, a model introduced in the pioneering
work of Muller, Saoudi, and Schupp [MSS86]. We introduce a variant of
alternating weak automata which differs from the model of [MSS86] in the
following way: There is no acyclicity requirement on the transition structure
of weak alternating automata, and acceptance is defined only in terms of
reachability of states; moreover, the run trees of the standard framework are
replaced by run dags of bounded width. (For the equivalence to the model of
[MSS86] see the next section.) Starting from this, it turns out that all four
aspects mentioned above have counterparts in the framework of alternating
automata:



1. The equivalence between alternating weak automata and monadic
second-order logic over infinite strings provides a new normal form of
MSO-formulas, giving an alternative to the classical EMSO-normal form (for
the specification of accepting runs).

2. The complementation of alternating weak automata is presented in a
game-theoretic setting, based on a determinacy result on infinite games with
winning conditions in terms of reachability of states. (For a separate
exposition of this result see [Tho99].)

3. The basic classification of temporal properties (called the Landweber
hierarchy in the automata-theoretic setting) is captured in the framework of
alternating automata in two different ways: by restricting the alternating
computation mode (to universal, respectively existential branching) and by
restricting the acceptance component in alternating automata.

4. A structural property (on the loop structure) of weak alternating automata
is presented, which characterizes the properties definable in the temporal
logic PLTL.



As mentioned above, we define acceptance by alternating automata using run
dags instead of run trees. In [KV97], alternating automata are defined as in
[MSS86] (however with the co-Büchi acceptance condition), and then a
reduction to acceptance via run dags is carried out. In both cases, the approach
via run dags does not weaken the expressive power due to the fact that in the
associated infinite games (see Section 3) memoryless winning strategies are
sufficient. The determinacy proof presented in this paper (of which a
preliminary exposition was given by the second author in [Tho99]) is related to
a construction of Klarlund [Kla91]. The structural characterization of
PLTL-definable properties was obtained independently by Rohde [Roh97], however
with a more involved proof.

The paper is structured as follows: In Section 2 we introduce alternating
automata and their acceptance conditions. In Section 3 the dualization of
alternating automata and its connection to determinacy of infinite games and to
complementation is developed (see item 2 above). Section 4 shows that
alternating weak automata are able to recognize precisely the regular
$\omega$-languages, via a transformation of parity automata into alternating weak
automata. Finally, Sections 5, 6 and 7 present the results mentioned above
under item 1 (connection to MSO-logic), item 3 (classification of sequence
properties) and item 4 (characterization of PLTL-definable properties).

2 Alternating Automata

Alternating automata combine the possibility of existential and universal
branching. The transition function of an alternating automaton is defined with
positive boolean formulas over the state set.

Let $X$ be a finite set. The set of positive boolean formulas over $X$, denoted
by $\mathcal{B}^+(X)$, contains $\top$ (true), $\bot$ (false), all elements from $X$,
and all boolean combinations over $X$ built with $\wedge$ and $\vee$. A subset $S$ of
$X$ is a model of $\theta \in \mathcal{B}^+(X)$ iff the truth assignment that assigns
true to the elements of $S$ and false to the elements of $X \setminus S$ satisfies
$\theta$. We say $S$ is a minimal model of $\theta$ iff $S$ is a model of $\theta$ and no
proper subset of $S$ is a model of $\theta$. For $\theta \in \mathcal{B}^+(X)$ the set of
minimal models of $\theta$ is denoted by $\mathcal{M}_\theta$.
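The notion of a minimal model can be made concrete with a small brute-force sketch. The tuple encoding of formulas and the function names `holds` and `minimal_models` are assumptions made for this illustration only; they do not come from the paper.

```python
from itertools import combinations

# Assumed encoding of positive boolean formulas as nested tuples:
#   ("true",), ("false",), ("atom", x), ("and", f, g), ("or", f, g)

def holds(formula, chosen):
    """Evaluate a positive boolean formula; exactly the atoms in `chosen` are true."""
    tag = formula[0]
    if tag == "true":
        return True
    if tag == "false":
        return False
    if tag == "atom":
        return formula[1] in chosen
    left, right = holds(formula[1], chosen), holds(formula[2], chosen)
    return (left and right) if tag == "and" else (left or right)

def minimal_models(formula, universe):
    """Return all minimal models of `formula` among the subsets of `universe`."""
    found = []
    elems = sorted(universe)
    # Subsets are enumerated by increasing size. Because the formula is
    # positive (monotone), a non-minimal model always contains a minimal
    # model of smaller size, which has already been recorded in `found`.
    for size in range(len(elems) + 1):
        for subset in combinations(elems, size):
            s = frozenset(subset)
            if holds(formula, s) and not any(m < s for m in found):
                found.append(s)
    return found

# Example formula: (q1 and q3) or (q2 and q3)
example = ("or",
           ("and", ("atom", "q1"), ("atom", "q3")),
           ("and", ("atom", "q2"), ("atom", "q3")))
example_models = minimal_models(example, {"q0", "q1", "q2", "q3"})
```

The enumeration is exponential in the size of the universe, which is acceptable for illustration but not meant as an efficient procedure.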

An alternating automaton $\mathcal{A}$ is of the form
$\mathcal{A} = (Q, \Sigma, q_0, \delta, AC)$, where $Q$ is a finite state set, $\Sigma$ is a
finite alphabet, $q_0 \in Q$ is the initial state,
$\delta : Q \times \Sigma \to \mathcal{B}^+(Q)$ is the transition function and $AC$ is the
acceptance component. There are several different types of acceptance
conditions referring to different types of acceptance components.

Since in an alternating automaton there is universal branching, a run of
an alternating automaton is not an infinite sequence of states, but an infinite
acyclic graph. This graph has a “root vertex” labelled with the initial state
$q_0$, and in distance $l$ from this vertex one finds the states which are assumed
by the automaton after $l$ input letters. Formally a run is defined as follows.
Let $\alpha \in \Sigma^\omega$ and let $G = (V, E)$ be a directed acyclic graph with the
following properties.

— $V \subseteq Q \times \mathbb{N}$ with $(q_0, 0) \in V$.

— $E \subseteq \bigcup_{l \geq 0} (Q \times \{l\}) \times (Q \times \{l+1\})$.

— For every $(q, l) \in V \setminus \{(q_0, 0)\}$ there exists a $q' \in Q$ such that
$((q', l-1), (q, l)) \in E$.

$G$ is called a run of $\mathcal{A} = (Q, \Sigma, q_0, \delta, AC)$ on $\alpha$ if for every
$(q, l) \in V$ the set $\{q' \in Q \mid ((q, l), (q', l+1)) \in E\}$ is a minimal model
of $\delta(q, \alpha(l))$. An example is given in Figure 1.

Note that there is no run $G = (V, E)$ on $\alpha$ such that $(q, l) \in V$ for a $q$
with $\delta(q, \alpha(l)) = \bot$, since $\bot$ has no models.

$Q = \{q_0, q_1, q_2, q_3\}$, $\Sigma = \{a\}$
$\delta(q_0, a) = q_1 \wedge q_2$
$\delta(q_1, a) = (q_1 \wedge q_3) \vee (q_2 \wedge q_3)$
$\delta(q_2, a) = q_1$
$\delta(q_3, a) = (q_1 \wedge q_2) \vee q_3$

Fig. 1. First segment of a run of an alternating automaton.

With this definition of the transition function of alternating automata we
get deterministic, nondeterministic, and universal automata as special cases of
alternating automata. In deterministic automata the formulas used in the
transition function consist of exactly one state. In nondeterministic automata
the formulas are disjunctions of states or $\bot$, and dual to that, in universal
automata the formulas are conjunctions of states or $\top$.

For later use in Section 7 we note that it is possible to use alternating
automata with an initial positive boolean formula $\theta_0$ instead of a single
initial state. Such an automaton can be converted into an equivalent automaton
with a single initial state just by adding one extra state.

An automaton accepts a word iff there exists a run of the automaton on
this word such that every infinite path through that run satisfies the acceptance
condition. The language accepted by the automaton consists of all the words
that are accepted by the automaton. Here we identify an infinite path $\pi$ with
the sequence of states induced by this path. The infinity set $In(\pi)$ consists of
all states that appear infinitely often in $\pi$. The occurrence set $Oc(\pi)$
consists of all states that appear at least once in $\pi$. The following different
types of acceptance conditions are considered in this paper.

In Büchi and co-Büchi automata the acceptance condition refers to a subset
$F$ of the state set, and in parity automata the acceptance condition refers to a
mapping (called coloring) $c : Q \to \{0, \dots, k\}$. The numbers $0, \dots, k$ are
called the colors. For an infinite path $\pi$ the corresponding infinite sequence
of colors is then denoted as $c(\pi)$. An infinite path $\pi$ satisfies

— the Büchi condition w.r.t. $F$ iff $F \cap In(\pi) \neq \emptyset$,

— the co-Büchi condition w.r.t. $F$ iff $F \cap In(\pi) = \emptyset$,

— the parity condition w.r.t. $c$ iff $\min(In(c(\pi)))$ is even.

For all of these acceptance types we can also consider the “weak versions”. We
call an acceptance condition weak if it is evaluated in the occurrence set instead
of the infinity set of a path. So an infinite path $\pi$ satisfies

— the weak Büchi condition w.r.t. $F$ iff $F \cap Oc(\pi) \neq \emptyset$,

— the weak co-Büchi condition w.r.t. $F$ iff $F \cap Oc(\pi) = \emptyset$,

— the weak parity condition w.r.t. $c$ iff $\min(Oc(c(\pi)))$ is even.
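The six conditions can be compared on an ultimately periodic path. The following sketch assumes, for illustration only, that a path is given as a finite prefix followed by a non-empty cycle that repeats forever; the function names are hypothetical.

```python
def infinity_set(prefix, cycle):
    """States visited infinitely often on the path prefix . cycle^omega."""
    return set(cycle)

def occurrence_set(prefix, cycle):
    """States visited at least once on the path prefix . cycle^omega."""
    return set(prefix) | set(cycle)

def buchi(prefix, cycle, F):
    return bool(set(F) & infinity_set(prefix, cycle))

def co_buchi(prefix, cycle, F):
    return not buchi(prefix, cycle, F)

def parity(prefix, cycle, color):
    return min(color[q] for q in infinity_set(prefix, cycle)) % 2 == 0

def weak_buchi(prefix, cycle, F):
    return bool(set(F) & occurrence_set(prefix, cycle))

def weak_co_buchi(prefix, cycle, F):
    return not weak_buchi(prefix, cycle, F)

def weak_parity(prefix, cycle, color):
    return min(color[q] for q in occurrence_set(prefix, cycle)) % 2 == 0

# Example path: q0 q1 (q2)^omega, with an assumed coloring.
path_prefix, path_cycle = ["q0", "q1"], ["q2"]
example_color = {"q0": 3, "q1": 2, "q2": 3}
```

On this path the state `q1` is seen once but not infinitely often, so the weak and the ordinary versions of each condition give different answers.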

In the present paper, we focus on weak automata with the weak parity acceptance
condition. This differs from the model of weak automata as introduced by Muller
and Schupp in [MSS86], where the Büchi acceptance condition is used. Moreover,
in that model the transition structure of a weak automaton
$\mathcal{A} = (Q, \Sigma, q_0, \delta, F)$ must fulfill the following requirement: There is
a partition $Q_1, \dots, Q_m$ of $Q$ such that for every $q \in Q_i$ and $q' \in Q_j$
with a transition leading from $q$ to $q'$ one has $j \leq i$, and $Q_i \subseteq F$ or
$Q_i \cap F = \emptyset$ for every $i \in \{1, \dots, m\}$.

Let us verify that the two models are equivalent in expressive power. Given
a weak automaton $\mathcal{A}$ as above (in the sense of [MSS86]), acceptance means
that for every path $\pi$ through a run of $\mathcal{A}$ there is an $i \in \{1, \dots, m\}$
such that $In(\pi) \subseteq Q_i \subseteq F$. This can also be expressed as a weak parity
condition because a path through the run satisfies the acceptance condition if it
enters one of the accepting $Q_i$'s and never leaves it again. For all
$i \in \{1, \dots, m\}$ and for all $q \in Q_i$ we define the coloring $c$ as

$$c(q) = \begin{cases} 2i & \text{if } Q_i \subseteq F, \\ 2i + 1 & \text{if } Q_i \cap F = \emptyset. \end{cases}$$

It is easy to see that the weak parity automaton
$\mathcal{A}' = (Q, \Sigma, q_0, \delta, c)$ is equivalent to $\mathcal{A}$.

Conversely, given a weak parity automaton $\mathcal{A} = (Q, \Sigma, q_0, \delta, c)$ with
$c : Q \to C$, let $C_e$ be the set of even numbers in $C$ and let $Q' = Q \times C$,
$q'_0 = (q_0, c(q_0))$, $F' = Q \times C_e$. To define the transition function we need
an auxiliary mapping $\phi : \mathcal{B}^+(Q) \times C \to \mathcal{B}^+(Q')$. The formula
$\phi(\theta, i)$ is obtained by replacing every $q \in Q$ in $\theta$ with
$(q, \min\{i, c(q)\})$. Then we define the transition function by
$\delta'((q, i), a) = \phi(\delta(q, a), i)$. In the second component of the new states
the automaton remembers the minimal color that was seen so far. Along a path
through a run this color may not increase. Therefore, if we define the sets $Q_i$
according to the color in the second component, we get an automaton in the form
of [MSS86] equivalent to $\mathcal{A}$.
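The converse construction can be sketched as follows. The tuple encoding of formulas and the names `atoms`, `phi` and `to_weak_buchi` are assumptions for this illustration, and the color set $C$ is taken to be the set of colors actually used by $c$.

```python
# Assumed encoding of positive boolean formulas as nested tuples:
#   ("true",), ("false",), ("atom", x), ("and", f, g), ("or", f, g)

def atoms(theta):
    """Set of atoms occurring in a formula."""
    tag = theta[0]
    if tag == "atom":
        return {theta[1]}
    if tag in ("true", "false"):
        return set()
    return atoms(theta[1]) | atoms(theta[2])

def phi(theta, i, color):
    """Replace every state q in theta by the pair (q, min(i, c(q)))."""
    tag = theta[0]
    if tag == "atom":
        q = theta[1]
        return ("atom", (q, min(i, color[q])))
    if tag in ("true", "false"):
        return theta
    return (tag, phi(theta[1], i, color), phi(theta[2], i, color))

def to_weak_buchi(states, alphabet, q0, delta, color):
    """Product with the minimal color seen so far; even minima are accepting."""
    colors = set(color.values())
    new_states = {(q, i) for q in states for i in colors}
    new_q0 = (q0, color[q0])
    new_delta = {
        ((q, i), a): phi(delta[(q, a)], i, color)
        for (q, i) in new_states for a in alphabet
    }
    accepting = {(q, i) for (q, i) in new_states if i % 2 == 0}
    return new_states, new_q0, new_delta, accepting

# Small hypothetical automaton: p loops or moves to q, q loops.
ex_delta = {("p", "a"): ("or", ("atom", "p"), ("atom", "q")),
            ("q", "a"): ("atom", "q")}
ex_color = {"p": 1, "q": 0}
ex_states, ex_q0, ex_new_delta, ex_accepting = to_weak_buchi(
    ["p", "q"], ["a"], "p", ex_delta, ex_color)
```

In the result the second component of a successor never exceeds that of its predecessor, which is what allows the states to be partitioned by that component.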

3 Complementation 

In [MS87] Muller and Schupp show that complementation of alternating
automata can be done by dualization. In this section we give a self-contained
proof of this complementation theorem for the case of alternating weak parity
automata. This is done in a game-theoretic framework, making use of the
simple winning conditions which are derived from weak alternating automata.
So we do not rely on difficult determinacy results, e.g. for Borel games, as done
in [MS87]. Before we turn to games we define the dual of an alternating weak
parity automaton.

For a finite set $X$ and $\theta \in \mathcal{B}^+(X)$ the formula $\tilde\theta$, the dual of
$\theta$, is obtained by exchanging $\vee$ and $\wedge$, and $\top$ and $\bot$. We can state
the following relation between the minimal models of $\theta$ and the models of
$\tilde\theta$.

Remark 1. Let $\theta \in \mathcal{B}^+(X)$. A set $S \subseteq X$ is a model of $\tilde\theta$ iff
$S \cap R \neq \emptyset$ for all minimal models $R$ of $\theta$.

Proof. The formula $\tilde\theta$ is equivalent to
$\bigwedge_{R \in \mathcal{M}_\theta} \bigvee_{x \in R} x$, which is the conjunctive normal form of
$\tilde\theta$. $S$ is a model of $\tilde\theta$ iff it contains at least one element from
each of the disjunctive terms.
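Remark 1 can be checked by brute force on small formulas. The sketch below is only an illustration; the tuple encoding of formulas and all function names are assumptions, not notation from the paper.

```python
from itertools import combinations

# Assumed encoding of positive boolean formulas as nested tuples:
#   ("true",), ("false",), ("atom", x), ("and", f, g), ("or", f, g)

def holds(theta, chosen):
    tag = theta[0]
    if tag == "true":
        return True
    if tag == "false":
        return False
    if tag == "atom":
        return theta[1] in chosen
    left, right = holds(theta[1], chosen), holds(theta[2], chosen)
    return (left and right) if tag == "and" else (left or right)

def dual(theta):
    """Exchange 'and' with 'or' and 'true' with 'false'."""
    tag = theta[0]
    if tag == "true":
        return ("false",)
    if tag == "false":
        return ("true",)
    if tag == "atom":
        return theta
    swapped = "or" if tag == "and" else "and"
    return (swapped, dual(theta[1]), dual(theta[2]))

def subsets(universe):
    elems = sorted(universe)
    return [frozenset(c) for r in range(len(elems) + 1)
            for c in combinations(elems, r)]

def minimal_models(theta, universe):
    models = [s for s in subsets(universe) if holds(theta, s)]
    return [s for s in models if not any(t < s for t in models)]

def remark_1_holds(theta, universe):
    """S models dual(theta) iff S meets every minimal model of theta."""
    minimal = minimal_models(theta, universe)
    return all(
        holds(dual(theta), s) == all(bool(s & r) for r in minimal)
        for s in subsets(universe)
    )

# Example formula: (q1 and q2) or q3
sample = ("or", ("and", ("atom", "q1"), ("atom", "q2")), ("atom", "q3"))
```

The two degenerate cases behave as expected: $\top$ has the empty set as its only minimal model, so no set meets it and the dual $\bot$ has no models, while $\bot$ has no minimal models, so every set is a model of the dual $\top$.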

Let $\mathcal{A} = (Q, \Sigma, q_0, \delta, c)$ be an alternating weak parity automaton. The
dual automaton $\tilde{\mathcal{A}}$ of $\mathcal{A}$ is defined as
$\tilde{\mathcal{A}} = (Q, \Sigma, q_0, \tilde\delta, \tilde c)$, where $\tilde\delta$ is defined by
$\tilde\delta(q, a) = \widetilde{\delta(q, a)}$ for all $q \in Q$ and $a \in \Sigma$, and
$\tilde c$ is defined by $\tilde c(q) = c(q) + 1$.

Since in $\tilde{\mathcal{A}}$ a state has an even color iff it has an odd color in
$\mathcal{A}$, we get the following remark.

Remark 2. A path $\pi$ satisfies the acceptance condition of $\mathcal{A}$ iff it does
not satisfy the acceptance condition of $\tilde{\mathcal{A}}$.

Let $\mathcal{A} = (Q, \Sigma, q_0, \delta, c)$ be an alternating weak parity automaton and let
$\alpha \in \Sigma^\omega$. With $\mathcal{A}$ and $\alpha$ we associate a graph
$G_{\mathcal{A},\alpha} = (V_A, V_P, E, w)$ which serves as a game arena for the two
players Automaton (A) and Pathfinder (P). The graph is defined as follows.

— $V_A = Q \times \mathbb{N}$ and $V_P = Q \times (2^Q \setminus \{\emptyset\}) \times \mathbb{N}$; let
$V$ always denote $V_A \cup V_P$.

— The edge relation $E$ is defined by

$((q, l), (q, S, l+1)) \in E$ iff $S \in \mathcal{M}_{\delta(q, \alpha(l))}$,
$((p, S, l), (q, l)) \in E$ iff $q \in S$,

for $p, q \in Q$, $S \subseteq Q$ and $l \in \mathbb{N}$.

— The coloring $w : V \to c(Q)$ is given by $w((q, l)) = c(q)$ for $(q, l) \in V_A$
and $w((q, S, l)) = c(q)$ for $(q, S, l) \in V_P$.

A play of $G_{\mathcal{A},\alpha}$ is an infinite sequence $\gamma \in (V_A V_P)^\omega$ such that
$\gamma(0) = (q_0, 0)$ and $(\gamma(i), \gamma(i+1)) \in E$ for all $i \in \mathbb{N}$. Automaton
wins the play $\gamma$ iff $\min(Oc(w(\gamma)))$ is even.

A positional strategy for A is a mapping $f_A : V_A \to V_P$ such that for all
$v \in V_A$ we have $(v, f_A(v)) \in E$. The play $\gamma$ is played according to $f_A$
iff for every $i \in \mathbb{N}$ with $\gamma(i) \in V_A$ one has
$\gamma(i+1) = f_A(\gamma(i))$. The strategy $f_A$ is called a positional winning
strategy for A iff A wins every play $\gamma$ played according to $f_A$.
Strategies for P are defined analogously.

The connections between alternating weak parity automata and the
corresponding games are stated in the following three lemmas.

Lemma 1. Let $\mathcal{A} = (Q, \Sigma, q_0, \delta, c)$ be an alternating weak parity
automaton and let $\alpha \in \Sigma^\omega$. Automaton has a positional winning strategy
in $G_{\mathcal{A},\alpha}$ iff $\alpha \in L(\mathcal{A})$.

The very simple proof is omitted.

Lemma 2. (Determinacy of weak parity games) Let
$\mathcal{A} = (Q, \Sigma, q_0, \delta, c)$ be an alternating weak parity automaton and let
$\alpha \in \Sigma^\omega$. In $G_{\mathcal{A},\alpha}$ either Automaton or Pathfinder has a
positional winning strategy.

Proof. We first define the notion of an attractor. Let $T \subseteq V_A \cup V_P$. The
A-attractor of $T$, denoted $Attr_A(T)$, is the set of vertices from which
Automaton can force the play to eventually visit $T$:

$$Attr_A(T) = \bigcup_{i \in \mathbb{N}} Attr_A^i(T), \quad \text{where } Attr_A^0(T) = T \text{ and}$$

$$v \in Attr_A^{i+1}(T) \iff \begin{cases} v \in Attr_A^i(T), \text{ or} \\ v \in V_A \text{ and } \exists (v, u) \in E : u \in Attr_A^i(T), \text{ or} \\ v \in V_P \text{ and } \forall (v, u) \in E : u \in Attr_A^i(T). \end{cases}$$

The P-attractor of $T$, denoted $Attr_P(T)$, is defined in the analogous way.

By induction on $m = |c(Q)|$, i.e., the number of colors in the weak automaton,
we show that one of the players has a positional winning strategy. For $m = 1$
every play in $G_{\mathcal{A},\alpha}$ is won by Automaton or every play is won by
Pathfinder. Therefore Automaton or Pathfinder has a positional winning strategy.

Let $m \geq 2$, $k = \min(w(V))$, and $V_k = \{v \in V \mid w(v) = k\}$. We assume that
$k$ is even. The proof for the other case is analogous.

If $(q_0, 0)$ belongs to $Attr_A(V_k)$, then obviously Automaton has a positional
winning strategy in $G_{\mathcal{A},\alpha}$. If $(q_0, 0)$ does not belong to $Attr_A(V_k)$,
then we define the game $G'_{\mathcal{A},\alpha}$ by removing the vertices of
$Attr_A(V_k)$ from $G_{\mathcal{A},\alpha}$. By induction we know that in
$G'_{\mathcal{A},\alpha}$ either Automaton or Pathfinder has a positional winning
strategy. If Pathfinder has a winning strategy in $G'_{\mathcal{A},\alpha}$, then playing
according to this strategy also forces the game to stay outside of $Attr_A(V_k)$
in the game $G_{\mathcal{A},\alpha}$. Otherwise there would be a vertex belonging to
$Attr_A(V_k)$ in $G'_{\mathcal{A},\alpha}$. Therefore Pathfinder also has a positional
winning strategy in $G_{\mathcal{A},\alpha}$.

Now suppose Automaton has a positional winning strategy in
$G'_{\mathcal{A},\alpha}$. If Automaton plays according to this strategy in
$G_{\mathcal{A},\alpha}$, then the only possibility for Pathfinder to make the play
proceed differently than in $G'_{\mathcal{A},\alpha}$ is to move into $Attr_A(V_k)$ if
possible. But then Automaton wins by forcing the game to move into $V_k$.
Therefore Automaton has a positional winning strategy in $G_{\mathcal{A},\alpha}$.
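The attractor used in this proof can be computed by a simple fixpoint iteration when the game graph is finite. The sketch below assumes a finite graph given by hypothetical dictionaries `owner` and `edges`; the arena $G_{\mathcal{A},\alpha}$ of the paper is infinite, so this only illustrates the inductive definition.

```python
def attractor(target, owner, edges, player):
    """Vertices from which `player` can force a visit to `target`.

    owner: dict vertex -> "A" or "P" (who moves at that vertex)
    edges: dict vertex -> list of successor vertices
    """
    attr = set(target)
    changed = True
    while changed:
        changed = False
        for v, successors in edges.items():
            if v in attr:
                continue
            if owner[v] == player:
                # The player needs one edge into the current attractor.
                ok = any(u in attr for u in successors)
            else:
                # The opponent must have all edges leading into it
                # (vacuously true for a vertex without successors,
                # as in the quantified definition).
                ok = all(u in attr for u in successors)
            if ok:
                attr.add(v)
                changed = True
    return attr

# Small hypothetical game graph.
game_owner = {"a0": "A", "p1": "P", "p2": "P", "t": "A", "x": "A"}
game_edges = {"a0": ["p1", "p2"], "p1": ["t"], "p2": ["t", "x"],
              "t": ["t"], "x": ["x"]}
```

From `a0` Automaton can move to `p1`, where Pathfinder has no choice but to enter `t`, so `a0` lies in the A-attractor of `{"t"}`, while `p2` does not because Pathfinder can escape to `x`.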



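Lemma 3. Let $\mathcal{A} = (Q, \Sigma, q_0, \delta, c)$ be an alternating weak parity
automaton and let $\alpha \in \Sigma^\omega$. Automaton has a positional winning strategy
in $G_{\mathcal{A},\alpha}$ iff Pathfinder has a positional winning strategy in
$G_{\tilde{\mathcal{A}},\alpha}$.

Proof. Let $f_A$ be a positional winning strategy for Automaton in
$G_{\mathcal{A},\alpha}$ and let $(q, S, l)$ be a vertex of Pathfinder in
$G_{\tilde{\mathcal{A}},\alpha}$. If there exists a play such that $(q, S, l)$ appears in
this play, then $S \in \mathcal{M}_{\tilde\delta(q, \alpha(l-1))}$ (for vertices that may not
appear in a play we do not have to define the strategy). From Remark 1 it follows
that $S \cap f_A(q, l-1) \neq \emptyset$, where $f_A(q, l-1)$ is identified with the set
component of the chosen vertex. Let $p \in S \cap f_A(q, l-1)$. We define
$f_P(q, S, l) = p$. For a play $\gamma$ of $G_{\tilde{\mathcal{A}},\alpha}$ played according to
$f_P$ there exists a play $\gamma'$ of $G_{\mathcal{A},\alpha}$ played according to $f_A$
such that $\gamma$ and $\gamma'$ pass through the same states, i.e.
$w(\gamma) = w(\gamma')$ when both are colored via $c$. Since Automaton wins
$\gamma'$ in $G_{\mathcal{A},\alpha}$, Pathfinder wins $\gamma$ in $G_{\tilde{\mathcal{A}},\alpha}$.

Let $f_P$ be a positional winning strategy for Pathfinder in
$G_{\tilde{\mathcal{A}},\alpha}$ and let $(q, l)$ be a vertex of Automaton in
$G_{\mathcal{A},\alpha}$. The set
$S = \{f_P(q, R, l+1) \mid R \in \mathcal{M}_{\tilde\delta(q, \alpha(l))}\}$ is a model of
$\delta(q, \alpha(l))$ by Remark 1. Let $S' \subseteq S$ be a minimal model of
$\delta(q, \alpha(l))$. We define $f_A(q, l) = (q, S', l+1)$. Again, for a play $\gamma$ in
$G_{\mathcal{A},\alpha}$ played according to $f_A$ there exists a play $\gamma'$ played
according to $f_P$ in $G_{\tilde{\mathcal{A}},\alpha}$ such that $w(\gamma) = w(\gamma')$.
Since Pathfinder wins $\gamma'$ in $G_{\tilde{\mathcal{A}},\alpha}$, Automaton wins $\gamma$ in
$G_{\mathcal{A},\alpha}$.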



Theorem 1. Let $\mathcal{A}$ be an alternating weak parity automaton over the alphabet
$\Sigma$. Then $L(\tilde{\mathcal{A}}) = \Sigma^\omega \setminus L(\mathcal{A})$.

Proof. Let $\alpha \in \Sigma^\omega$. The automaton $\mathcal{A}$ accepts $\alpha$ iff Automaton
has a positional winning strategy in $G_{\mathcal{A},\alpha}$ by Lemma 1. By Lemma 3
this is equivalent to Pathfinder having a winning strategy in
$G_{\tilde{\mathcal{A}},\alpha}$. With Lemma 2 we know that this is the same as Automaton
having no positional winning strategy in $G_{\tilde{\mathcal{A}},\alpha}$, and again using
Lemma 1 this is equivalent to $\alpha \notin L(\tilde{\mathcal{A}})$.



4 Expressive Completeness 



In this section we show that every regular $\omega$-language can be recognized by
an alternating weak parity automaton. We give a transformation of
deterministic parity automata into alternating weak parity automata; it seems to
be the simplest way to establish expressive completeness of alternating weak
parity automata. (It is well known that deterministic parity automata recognize
precisely the regular $\omega$-languages, see [Tho97].) In contrast to deterministic
parity automata, where it is not possible to bound the number of colors, for
alternating weak parity automata it suffices to consider automata with only three
colors.



Theorem 2. For every deterministic parity automaton
$\mathcal{A} = (Q, \Sigma, q_0, \delta, c)$ with $|Q| = n$ and $c : Q \to \{1, \dots, m\}$ one can
construct an equivalent alternating weak parity automaton
$\mathcal{A}' = (Q', \Sigma, q'_0, \delta', c')$ with $|Q'| = (m+1)n$ and
$c' : Q' \to \{1, 2, 3\}$.

Proof. We can assume that $m = 2k$ for some $k$. Define $\mathcal{A}'$ as follows.

— Let $Q' = Q \cup (Q \times \{1, \dots, k\} \times \{0, 1\})$ and $q'_0 = q_0$.

— For $q \in Q$, $i \in \{1, \dots, k\}$, $a \in \Sigma$, and $p = \delta(q, a)$ define

$$\delta'(q, a) = p \vee \bigvee_{j=1}^{k} (p, j, 0),$$

$$\delta'((q, i, 0), a) = \begin{cases} \bot & \text{if } c(q) < 2i, \\ (p, i, 0) \wedge (p, i, 1) & \text{otherwise,} \end{cases}$$

$$\delta'((q, i, 1), a) = \begin{cases} \top & \text{if } c(q) = 2i, \\ (p, i, 1) & \text{otherwise.} \end{cases}$$

— For $q \in Q$ and $i \in \{1, \dots, k\}$ let $c'(q) = 3$, $c'((q, i, 0)) = 2$,
$c'((q, i, 1)) = 1$.

The idea is to guess the accepting color and the point from where on no smaller
color occurs, and then to verify that the guessed color occurs infinitely often and
no smaller color occurs anymore. The correctness proof is omitted.
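The construction in this proof can be written down directly. The sketch below is an illustration under assumptions: formulas are encoded as nested tuples, the original states are plain strings (so they cannot be confused with the triples $(q, i, b)$), and the function names are hypothetical.

```python
# Assumed encoding of positive boolean formulas as nested tuples:
#   ("true",), ("false",), ("atom", x), ("and", f, g), ("or", f, g)

def atom(x):
    return ("atom", x)

def to_weak_parity(states, alphabet, delta, color, k):
    """Build delta' and c' from a deterministic parity automaton.

    delta: dict (q, a) -> successor state; color: dict q -> value in 1..2k.
    The initial state stays the same and is therefore not returned.
    """
    new_delta, new_color = {}, {}
    for q in states:
        new_color[q] = 3
        for i in range(1, k + 1):
            new_color[(q, i, 0)] = 2
            new_color[(q, i, 1)] = 1
        for a in alphabet:
            p = delta[(q, a)]
            # Either keep waiting in Q, or guess the accepting color 2j now.
            guess = atom(p)
            for j in range(1, k + 1):
                guess = ("or", guess, atom((p, j, 0)))
            new_delta[(q, a)] = guess
            for i in range(1, k + 1):
                # (q, i, 0): reject on a color below 2i, otherwise continue
                # and start one more check that color 2i is reached again.
                new_delta[((q, i, 0), a)] = (
                    ("false",) if color[q] < 2 * i
                    else ("and", atom((p, i, 0)), atom((p, i, 1)))
                )
                # (q, i, 1): stop successfully once color 2i is seen.
                new_delta[((q, i, 1), a)] = (
                    ("true",) if color[q] == 2 * i else atom((p, i, 1))
                )
    return new_delta, new_color

# Hypothetical two-state automaton with colors 1 and 2 (so k = 1).
dpa_delta = {("s", "a"): "t", ("t", "a"): "s"}
dpa_color = {"s": 1, "t": 2}
wp_delta, wp_color = to_weak_parity(["s", "t"], ["a"], dpa_delta, dpa_color, 1)
```

The number of states of the result is $n + 2kn = (m+1)n$, matching the bound in Theorem 2.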



Note that along each path through a run of $\mathcal{A}'$ the colors never increase,
which corresponds to the usual definition of weak automata. In fact our model
without this restriction is equivalent to the usual model.

It is also possible to start from nondeterministic Büchi automata instead of
deterministic parity automata, using a similar construction as in [KV97].

5 Alternating Automata and MSO



In this section we give a characterization of alternating weak parity automata in
MSO logic. We obtain a normal form of MSO formulas different from the EMSO
formulas obtained by Büchi [Büc62].

The MSO formulas we consider are S1S (second-order theory of one successor)
formulas, built up in the usual way from $x, y, z, x_1, x_2, \dots$ as (first-order)
variables for natural numbers, $X, Y, Z, X_1, X_2, \dots$ as (second-order) variables
for sets of natural numbers, the symbols $0, +1, =, <, \in$ with their usual
meanings, the connectives $\wedge, \vee, \to, \leftrightarrow$, and the quantifiers
$\exists, \forall$.

We can interpret the sets of natural numbers as predicates, and abbreviate
$x \in X$ with $Xx$. We also use natural abbreviations such as
$\leq, \geq, >, +2, +3, \dots$ and $\exists^\omega$ (“there exist infinitely many”),
$\forall^\omega$ (“for almost all”).

The formulas are interpreted in the structure $(\mathbb{N}, <, +1, 0)$. As models we
take tuples of subsets of $\mathbb{N}$. Such a tuple
$\underline{\alpha} = (Q_1, \dots, Q_n)$, with $Q_1, \dots, Q_n \subseteq \mathbb{N}$, is a model of a
formula $\phi(X_1, \dots, X_n)$, denoted by
$\underline{\alpha} \models \phi(X_1, \dots, X_n)$, if and only if $\phi$ evaluates to true
when we substitute each $X_i$ by $Q_i$.

To characterize automata in S1S, we use a correspondence between infinite
words $\alpha$ over an alphabet $\Sigma$ and the models $\underline{\alpha}$ of S1S. Then we
will construct a formula $\Phi_{\mathcal{A}}$ such that the automaton $\mathcal{A}$ accepts a
word $\alpha$ if and only if the corresponding tuple $\underline{\alpha}$ is a model of
$\Phi_{\mathcal{A}}$. In addition we construct a second formula $\tilde\Phi_{\mathcal{A}}$
equivalent to $\Phi_{\mathcal{A}}$ such that
$\neg\Phi_{\mathcal{A}} = \tilde\Phi_{\tilde{\mathcal{A}}}$. The connection between these two
formulas corresponds to the connection between an automaton and its dual.

To code $\omega$-words by sets of natural numbers, we assume without loss of
generality that $\Sigma = \{0, 1\}^k$. With this convention a word
$\alpha \in \Sigma^\omega$ consists of $k$ words $\alpha_1, \dots, \alpha_k \in \{0, 1\}^\omega$.
We code $\alpha_i$ with the set $X_i$, where $x \in X_i$ iff $\alpha_i(x) = 1$. Then the
tuple $\underline{\alpha} = (X_1, \dots, X_k)$ is a unique coding of $\alpha$. We will refer
to natural numbers as “positions”. Now we can express the fact that a word has a
certain letter $a = (a_1, \dots, a_k)$ at the position $x$ by the formula

$$\mathrm{POS}_a(x, X_1, \dots, X_k) = \bigwedge_{i \in O(a)} X_i x \wedge \bigwedge_{i \in Z(a)} \neg X_i x,$$

where $O(a) = \{i \mid a_i = 1\}$ and $Z(a) = \{i \mid a_i = 0\}$.
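The coding of words by sets can be tried out on a finite prefix. The sketch below is an illustration only: infinite words cannot be stored, so the hypothetical helper `code_prefix` restricts the sets $X_i$ to the positions of a given prefix, and components are indexed from 0 rather than from 1.

```python
def code_prefix(word, k):
    """Return [X_1, ..., X_k] restricted to the positions of a finite prefix.

    word: list of letters, each a k-tuple of bits.
    """
    return [{x for x, letter in enumerate(word) if letter[i] == 1}
            for i in range(k)]

def pos(a, x, X):
    """POS_a(x, X_1, ..., X_k): position x carries exactly the letter a."""
    return all((x in X[i]) == (a[i] == 1) for i in range(len(a)))

# Prefix of a word over {0,1}^2.
sample_word = [(1, 0), (0, 0), (1, 1)]
sample_sets = code_prefix(sample_word, 2)
```

Each position satisfies $\mathrm{POS}_a$ for exactly one letter $a$, namely the letter of the word at that position.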

Let $\mathcal{A} = (Q, \Sigma, q_0, \delta, c)$ be an alternating weak parity automaton with
$c : Q \to \{0, \dots, k\}$ and without $\top$ and $\bot$ in the transition function
(this can be obtained by adding at most one extra state). We have to code the
runs of an automaton on a word with subsets of $\mathbb{N}$. Let $Q = \{1, \dots, n\}$ with
$q_0 = 1$ and let $m = n + n^2$. We code a level $l$ of a run and the edges to the
previous level with a vector $v_l \in \{0, 1\}^m$. The first $n$ entries code the
active states. This means entry $i$ is 1 iff $(i, l)$ is a vertex of the run. For
every $i \in \{1, \dots, n\}$ the entries from $i \cdot n + 1$ to $i \cdot n + n$ code the
successors of the vertex $(i, l-1)$. This means entry $i \cdot n + j$ is 1 iff $(j, l)$
is a successor of $(i, l-1)$. This idea is illustrated in Figure 2 for the
beginning of a run of an automaton with states $\{q_0, q_1, q_2, q_3\}$.
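The level coding can be sketched in a few lines. The helper names `code_level` and `sets_from_vectors` are hypothetical; the paper's entries are numbered from 1, so entry number $e$ is stored at list index $e - 1$.

```python
def code_level(n, active, incoming):
    """Code one level l of a run as a 0/1 vector of length m = n + n*n.

    active:   set of states i (1..n) such that (i, l) is a vertex of the run
    incoming: set of pairs (i, j) such that (j, l) is a successor of (i, l-1)
    """
    m = n + n * n
    v = [0] * m
    for i in active:
        v[i - 1] = 1                 # entry i marks the active state i
    for (i, j) in incoming:
        v[i * n + j - 1] = 1         # entry i*n + j marks the edge i -> j
    return v

def sets_from_vectors(vectors):
    """Turn the sequence v_0, v_1, ... into sets Y_1, ..., Y_m of levels."""
    m = len(vectors[0])
    return [{l for l, v in enumerate(vectors) if v[e] == 1} for e in range(m)]

# Two levels of a run of a hypothetical automaton with n = 2 states.
level_0 = code_level(2, {1}, set())
level_1 = code_level(2, {1, 2}, {(1, 1), (1, 2)})
run_sets = sets_from_vectors([level_0, level_1])
```

Level 0 contains only the initial state and has no incoming edges, so its vector has a single 1 in the first entry.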

This coding yields an infinite sequence $v_0, v_1, \dots$ of vectors from
$\{0, 1\}^m$ which can also be represented by $Y_1, \dots, Y_m \subseteq \mathbb{N}$ in the
same way as the words from $\Sigma^\omega$.

530 C. Löding and W. Thomas

Fig. 2. Coding of the beginning of a run

It is easy to verify that the first-order formula

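$$\begin{aligned}
\mathrm{RUN}_{\mathcal{A}}(X_1, \dots, X_k, Y_1, \dots, Y_m) ={}& Y_1 0 \wedge \bigwedge_{i=2}^{m} \neg Y_i 0 \\
&\wedge \forall x \bigwedge_{i=1}^{n} \Big( (x > 0 \wedge Y_i x) \to \bigvee_{j=1}^{n} Y_{j \cdot n + i}\, x \Big) \\
&\wedge \forall x \bigwedge_{i=1}^{n} \Big( \neg Y_i x \to \bigwedge_{j=1}^{n} \neg Y_{i \cdot n + j}(x + 1) \Big) \\
&\wedge \forall x \bigwedge_{(i,a) \in Q \times \Sigma} \Big( (Y_i x \wedge \mathrm{POS}_a(x, X_1, \dots, X_k)) \to \\
&\qquad \bigvee_{S \in \mathcal{M}_{\delta(i,a)}} \Big( \bigwedge_{j \in S} Y_j (x + 1) \wedge \bigwedge_{j \in S} Y_{i \cdot n + j}(x + 1) \wedge \bigwedge_{j \notin S} \neg Y_{i \cdot n + j}(x + 1) \Big) \Big)
\end{aligned}$$

is satisfied iff $Y_1, \dots, Y_m$ code a run of $\mathcal{A}$ on the input coded by
$X_1, \dots, X_k$. In the same way we can construct a formula
$\mathrm{DUALRUN}_{\mathcal{A}}(X_1, \dots, X_k, Y_1, \dots, Y_m)$ that defines the runs of the
dual automaton of $\mathcal{A}$.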

To express the acceptance of a word by an automaton we now have to code
the paths of a run. A path through a run $G = (V, E)$ can be viewed as a
subgraph $G' = (V', E')$ of $G$ ($V' \subseteq V$ and $E' \subseteq E$) such that every vertex
has exactly one successor and every vertex except the initial vertex has exactly
one predecessor. We can code a path by $Z_1, \dots, Z_n \subseteq \mathbb{N}$. These $Z_i$ must
have the following properties (with $i$ and $j$ ranging over $\{1, \dots, n\}$):

— $Z_i \subseteq Y_i$ for every $i$ ($V' \subseteq V$),

— $Z_i \cap Z_j = \emptyset$ for every $i \neq j$ (in every level there is at most one
vertex),

— $\forall x\, \exists i\, (x \in Z_i)$ (in every level there is at least one vertex),

— for all $x$ with $Z_i x$ and $Z_j (x + 1)$ one has $Y_{i \cdot n + j}(x + 1)$ (the vertex
in level $x + 1$ is a successor of the vertex in level $x$),

and we indicate by $\mathrm{PATH}_{\mathcal{A}}(Y_1, \dots, Y_m, Z_1, \dots, Z_n)$ a first-order
formula expressing this.

The last fact we have to express is that a path satisfies the acceptance
condition. We define for $m \in \{0, \dots, k\}$ the set $Q_m$ of states with color
$m$, i.e., $Q_m = \{q \in Q \mid c(q) = m\}$. Let $Z_1, \dots, Z_n \subseteq \mathbb{N}$ be the
coding of a path through a run. Then the first-order formula

$$\mathrm{WEAKACC}_{\mathcal{A}}(Z_1, \dots, Z_n) = \bigvee_{\substack{m = 0 \\ m \text{ even}}}^{k} \Big( \exists x \Big( \bigvee_{i \in Q_m} Z_i x \Big) \wedge \forall x \Big( \bigwedge_{i \in \bigcup_{l < m} Q_l} \neg Z_i x \Big) \Big)$$

is satisfied if and only if the path coded by $Z_1, \dots, Z_n$ satisfies the
acceptance condition of $\mathcal{A}$.

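Now we can translate an automaton into two equivalent S1S formulas. Here
we write $\overline{X}$ for $X_1, \dots, X_k$ (and similarly $\overline{Y}$ and $\overline{Z}$).

Theorem 3. The following two formulas are satisfied if and only if $\overline{X}$ is
the coding of a word $\alpha \in L(\mathcal{A})$.

$$\Phi_{\mathcal{A}}(\overline{X}) = \exists \overline{Y}\, \forall \overline{Z}\, \big( \mathrm{RUN}_{\mathcal{A}}(\overline{X}, \overline{Y}) \wedge (\mathrm{PATH}_{\mathcal{A}}(\overline{Y}, \overline{Z}) \to \mathrm{WEAKACC}_{\mathcal{A}}(\overline{Z})) \big),$$

$$\tilde\Phi_{\mathcal{A}}(\overline{X}) = \forall \overline{Y}\, \exists \overline{Z}\, \big( \mathrm{DUALRUN}_{\mathcal{A}}(\overline{X}, \overline{Y}) \to (\mathrm{PATH}_{\mathcal{A}}(\overline{Y}, \overline{Z}) \wedge \mathrm{WEAKACC}_{\mathcal{A}}(\overline{Z})) \big).$$

The formulas $\Phi_{\mathcal{A}}$, $\tilde\Phi_{\mathcal{A}}$ thus represent a normal form for
S1S-formulas of second-order quantifier prefix types $\Sigma_2$, $\Pi_2$,
respectively, in which the acceptance component WEAKACC only involves
reachability conditions.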

6 The Landweber Hierarchy 

The classical characterization of regular sequence properties in the Landweber
hierarchy [Lan69] uses deterministic automata with different acceptance
conditions. As we show, the hierarchy can also be characterized in two different
ways: first by alternating automata, equipped with weak acceptance conditions,
and second by fixing the acceptance condition as weak parity, and modifying the
mode of the transition function. The three different characterizations are shown
in Figure 3. For notational simplicity we abbreviate the type of an automaton
by the initial letters of its transition mode and its type of acceptance
condition. So for example UWCB denotes universal weak co-Büchi automata and DP
denotes deterministic parity automata. If T is such an identifier, then
$\mathcal{L}(\mathrm{T})$ denotes the class of languages characterized by automata of type T.
As explained in [MP92], the language classes $\mathcal{L}(\mathrm{DB})$,
$\mathcal{L}(\mathrm{DCB})$, and $\mathcal{L}(\mathrm{DB}) \cap \mathcal{L}(\mathrm{DCB})$ capture the (regular)
“recurrence properties”, “persistence properties”, and “obligation
properties”, respectively. So our results clarify their role in the framework of
alternating automata.



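Deterministic Automata                    Alternating Automata                          Weak Parity Automata

$\mathcal{L}(\mathrm{DP})$                             $\mathcal{L}(\mathrm{AWP})$                                $\mathcal{L}(\mathrm{AWP})$
$\mathcal{L}(\mathrm{DB})$    $\mathcal{L}(\mathrm{DCB})$                $\mathcal{L}(\mathrm{AWCB})$    $\mathcal{L}(\mathrm{AWB})$                  $\mathcal{L}(\mathrm{UWP})$    $\mathcal{L}(\mathrm{NWP})$
$\mathcal{L}(\mathrm{DB}) \cap \mathcal{L}(\mathrm{DCB})$               $\mathcal{L}(\mathrm{AWCB}) \cap \mathcal{L}(\mathrm{AWB})$                $\mathcal{L}(\mathrm{DWP})$

Fig. 3. Three different characterizations of the Landweber hierarchy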



Before we give the theorem stating the correctness of the characterization
from Figure 3 we need some preparations.

Lemma 4. For every deterministic Büchi automaton one can construct an
equivalent universal weak co-Büchi automaton.

Proof. For the transformation we use a simplified version of the construction
from Section 4.

In [MH84] Miyano and Hayashi give an exponential transformation of alternating
Büchi automata into nondeterministic Büchi automata. But the construction
also gives a more general theorem, which is stated below.

Theorem 4. Let $\mathcal{A}$ be an alternating Büchi (weak Büchi) automaton. One can
construct an equivalent nondeterministic Büchi (weak Büchi) automaton
$\mathcal{A}'$ and furthermore, if $\mathcal{A}$ is universal, then $\mathcal{A}'$ is
deterministic.

To apply this theorem to weak parity automata, we give a transformation of
weak parity automata into Büchi automata. The idea is to remember the lowest
color seen so far. This suffices to decide with a Büchi condition whether a path
is accepting or not.

Lemma 5. Let $\mathcal{A} = (Q, \Sigma, q_0, \delta, c)$ be a weak parity automaton with
$c : Q \to C$. One can construct an equivalent Büchi automaton
$\mathcal{A}' = (Q', \Sigma, q'_0, \delta', F')$ with the same mode of transition function
as $\mathcal{A}$.

Theorem 5. (1) $\mathcal{L}(\mathrm{DP}) = \mathcal{L}(\mathrm{AWP})$.

(2) $\mathcal{L}(\mathrm{DB}) = \mathcal{L}(\mathrm{AWCB}) = \mathcal{L}(\mathrm{UWP})$.

(3) $\mathcal{L}(\mathrm{DCB}) = \mathcal{L}(\mathrm{AWB}) = \mathcal{L}(\mathrm{NWP})$.

(4) $\mathcal{L}(\mathrm{DB}) \cap \mathcal{L}(\mathrm{DCB}) = \mathcal{L}(\mathrm{AWB}) \cap \mathcal{L}(\mathrm{AWCB}) = \mathcal{L}(\mathrm{DWP})$.

Proof. (1): This follows from Theorem 2 and from the fact that every language
recognized by an alternating weak parity automaton is regular.

(2): From Lemma 4 follows $\mathcal{L}(\mathrm{DB}) \subseteq \mathcal{L}(\mathrm{AWCB})$ and
$\mathcal{L}(\mathrm{DB}) \subseteq \mathcal{L}(\mathrm{UWP})$. Since one can transform every universal
weak parity automaton into a universal Büchi automaton by Lemma 5 and then into
a deterministic Büchi automaton by Theorem 4, we get
$\mathcal{L}(\mathrm{UWP}) \subseteq \mathcal{L}(\mathrm{DB})$.

If we are given an alternating weak co-Büchi automaton, we dualize it,
yielding an alternating weak Büchi automaton. This can be transformed into a
nondeterministic weak Büchi automaton by Theorem 4. If we dualize again, we
get a universal weak co-Büchi automaton, equivalent to the given automaton.
Therefore we get $\mathcal{L}(\mathrm{AWCB}) \subseteq \mathcal{L}(\mathrm{UWP})$, since weak co-Büchi
conditions are special cases of weak parity conditions.

We omit the proof of (3) because it is the dual statement to (2), and (4) can
be shown very easily, using (2) and (3), and some criteria on the loop structure
of deterministic $\omega$-automata [Lan69].



7 The PLTL Fragment 

In the previous section we gave exact characterizations for the different
fragments from the Landweber hierarchy. Another fragment of the regular
$\omega$-languages is the fragment of PLTL (propositional linear temporal logic, see
[Eme90]) that includes all languages that can be described by PLTL formulas,
or, equivalently, by first-order formulas. PLTL formulas are built up from a
finite set $P$ of atomic propositions, the boolean operators, and the temporal
operators $\bigcirc$ (Next), $\Box$ (Always), $\Diamond$ (Eventually), $U$ (Until). For
this section we fix the alphabet $\Sigma = 2^P$. Let $\alpha \in \Sigma^\omega$,
$i \in \mathbb{N}$, $p \in P$, and $\varphi, \varphi'$ be PLTL formulas. The relation $\models$ is
defined as follows.

$\alpha, i \models p$ iff $p \in \alpha(i)$,
$\alpha, i \models \bigcirc\varphi$ iff $\alpha, i + 1 \models \varphi$,
$\alpha, i \models \Box\varphi$ iff $\forall k \geq i\ (\alpha, k \models \varphi)$,
$\alpha, i \models \Diamond\varphi$ iff $\exists k \geq i\ (\alpha, k \models \varphi)$,
$\alpha, i \models \varphi U \varphi'$ iff $\exists k \geq i\ (\alpha, k \models \varphi'$ and $\forall i \leq j < k\ (\alpha, j \models \varphi))$.

For the boolean operators, $\models$ is defined in the straightforward way.

A PLTL formula $\varphi$ defines the language
$L(\varphi) = \{\alpha \in \Sigma^\omega \mid \alpha, 0 \models \varphi\}$. The PLTL fragment of
the regular $\omega$-languages is the class of all languages that can be defined by a
PLTL formula.

In this section we give an exact automata-theoretic characterization of this
fragment in terms of a subclass of alternating weak parity automata, so-called
alternating linear automata.

An alternating weak parity automaton $\mathcal{A} = (Q, \Sigma, q_0, \delta, c)$ is called a
linear automaton if in the transition graph there are no cycles containing 2 or
more states, and if along each path through the transition graph the colors of
the states do not increase.

A simple induction shows that a PLTL-definable language can also be
recognized by an alternating linear automaton. Here we show the other direction,
which was independently shown by Rohde [Roh97].

Theorem 6. Let $\mathcal{A} = (Q, \Sigma, q_0, \delta, c)$ be an alternating linear automaton.
There is a PLTL formula $\chi$ such that $L(\mathcal{A}) = L(\chi)$.

Proof. For $\theta \in \mathcal{B}^+(Q)$ let $\mathcal{A}(\theta) = (Q, \Sigma, \theta, \delta, c)$ (an
automaton with an initial formula instead of an initial state). For $R \subseteq P$ let
$\psi_R = \bigwedge_{p \in R} p \wedge \bigwedge_{p \notin R} \neg p$.

We construct for every $\theta \in \mathcal{B}^+(Q)$ a PLTL formula $\chi(\theta)$ with
$L(\chi(\theta)) = L(\mathcal{A}(\theta))$ and then set $\chi = \chi(q_0)$. Let
$\theta \in \mathcal{B}^+(Q)$. If for all $q \in Q$ that occur in $\theta$ the formula
$\chi(q)$ is already known, then we obtain $\chi(\theta)$ by replacing each atom $q$ in
$\theta$ by the formula $\chi(q)$. Furthermore we set $\chi(\top) = \top$ and
$\chi(\bot) = \bot$.

Let $q \in Q$ and let $Tr(q)$ denote the set of states $q'$ such that a transition
leads from $q$ to $q'$. Since $\mathcal{A}$ is a linear automaton, we can assume by
induction that for all $q' \in Tr(q) \setminus \{q\}$ the formula $\chi(q')$ is already
known, and therefore also all $\chi(\theta)$ with
$\theta \in \mathcal{B}^+(Tr(q) \setminus \{q\})$ are known.

For all $R \subseteq P$ we can write the transition formula for $q$ and $R$ in the form
$\delta(q, R) = (q \wedge \theta_{R,q}) \vee \theta'_{R,q}$ with
$\theta_{R,q}, \theta'_{R,q} \in \mathcal{B}^+(Tr(q) \setminus \{q\})$. We define

$$\chi(q) = \begin{cases} \varphi_q U \varphi'_q \vee \Box\varphi_q & \text{if } c(q) \text{ is even,} \\ \varphi_q U \varphi'_q & \text{if } c(q) \text{ is odd,} \end{cases}$$

with

$$\varphi_q = \bigvee_{R \subseteq P} (\psi_R \wedge \bigcirc\chi(\theta_{R,q})) \quad \text{and} \quad \varphi'_q = \bigvee_{R \subseteq P} (\psi_R \wedge \bigcirc\chi(\theta'_{R,q})).$$



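To show that for all $\theta \in B^+(Q)$ the equality $L(\chi(\theta)) = L(A(\theta))$ holds, it
suffices to show $L(\chi(q)) = L(A(q))$ for all $q \in Q$.

Let $q \in Q$ and let $\alpha \in L(\chi(q))$. If $c(q)$ is odd, then there exists a $k \in \mathbb{N}$
such that $\alpha, i \models \varphi_q$ for all $i < k$ and $\alpha, k \models \varphi'_q$. Thus, for all $i < k$, the word
$\alpha[i+1, \infty)$ is in $L(\chi(\theta_{\alpha(i),q}))$ and $\alpha[k+1, \infty)$ is in $L(\chi(\theta'_{\alpha(k),q}))$. By induction
we know that $L(\chi(\theta_{\alpha(i),q})) = L(A(\theta_{\alpha(i),q}))$ and $L(\chi(\theta'_{\alpha(k),q})) = L(A(\theta'_{\alpha(k),q}))$. An
accepting run of $A(q)$ has the following form:

[Run diagram: a path labelled $q$ reading $\alpha(0), \alpha(1), \ldots, \alpha(k-1), \alpha(k)$, with the
automata $A(\theta_{\alpha(0),q}), \ldots, A(\theta_{\alpha(k-1),q})$ and $A(\theta'_{\alpha(k),q})$ attached along the path.]

The identifiers of the automata stand for accepting runs of these automata
on the corresponding suffix of $\alpha$.

If $c(q)$ is even then the proof is analogous except for the case that the $\Box\varphi_q$
part is satisfied. Then we get a run of $A(q)$ with an infinite path labelled with $q$.
This run is also accepting, because $c(q)$ is even.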

For the other direction let $\alpha \in L(A(q))$. If $c(q)$ is odd, then an accepting run of
$A(q)$ on $\alpha$ is of the form given above. But then, using the induction hypothesis,
$\alpha \models \varphi_q \,\mathrm{U}\, \varphi'_q$. In case $c(q)$ is even, we can also get an accepting run with an infinite
path labelled with $q$, but then $\alpha \models \Box\varphi_q$.

Abstract. We address the problem of integrating information coming
from different sources. The information consists of facts that a central ser- 
ver collects and tries to combine using (a) a set of logical rules, i.e. a logic 
program, and (b) a hypothesis representing the server’s own estimates. 
In such a setting incomplete information from a source or contradictory 
information from different sources necessitate the use of many-valued lo- 
gics in which programs can be evaluated and hypotheses can be tested. 
To carry out such activities we propose a formal framework based on 
Belnap’s four-valued logic. In this framework we work with the class of 
programs defined by Fitting and we develop a theory for information in- 
tegration. We also establish an intuitively appealing connection between 
our hypothesis testing mechanism on the one hand, and the well-founded 
semantics and Kripke-Kleene semantics of Datalog programs with nega- 
tion, on the other hand. 

Keywords : deductive databases and knowledge bases, information in- 
tegration, logics of knowledge, inconsistency, four-valued logic. 



1 Introduction 

In several information oriented activities there is a need for combining (or “in- 
tegrating”) information coming from different sources. 

A typical example of such an information-oriented activity is building a data
warehouse, i.e. a special kind of very large database for decision-making support
in big enterprises. The information stored in a data warehouse is obtained
from queries to operational databases internal to the enterprise, and from remote 
information sources external to the enterprise accessed through the Internet. The 
answers to all such queries are then combined (by the so-called “integrator”) to 
derive the information to be stored in the data warehouse. 

The basic pattern of the data warehouse paradigm, i.e. collection of information
then integration, is encountered in many different situations. What usually
changes from one situation to another is the type (and volume) of the collected
information and the means used for the integration.

In this paper we address a specific problem of information integration, namely
the case where the information consists of facts that a central server collects
from a number of autonomous sources and then tries to combine using:

— a set of logical rules, i.e. a logic program, and 

— a hypothesis, representing the server’s own estimates. 

In such a setting incomplete information from a source or contradictory infor- 
mation coming from different sources necessitate the use of many-valued logics, 
in which programs can be evaluated and hypotheses can be tested. Let us see a 
simple example. 

Example 1. Consider a legal case where a judge (the “central server”) has to 
decide whether to charge a person named John accused of murder. To do so, the 
judge first collects facts from two different sources: the public prosecutor and 
the person’s lawyer. The judge then combines the collected facts using a set of 
rules in order to reach a decision. For the sake of our example let us suppose 
that the judge has collected a set of facts F that he combines using a set of rules 
R as follows: 



F:  witness(John)         false
    friends(John, Ted)    true

R:  suspect(X) $\leftarrow$ motive(X) $\vee$ witness(X)
    innocent(X) $\leftarrow$ $\exists Y$ (alibi(X, Y) $\wedge$ $\neg$friends(X, Y))
    friends(X, Y) $\leftarrow$ friends(Y, X) $\vee$ (friends(X, Z) $\wedge$ friends(Z, Y))
    charge(X) $\leftarrow$ suspect(X) $\oplus$ $\neg$innocent(X)

The first fact of F says that there is no witness, i.e. the fact witness(John)
is false. The second fact of F says that Ted is a friend of John, i.e. the fact
friends(John, Ted) is true.

Turning now to the set of rules, the first rule of R describes how the prosecutor 
works: in order to support the claim that a person is a suspect, the prosecutor 
tries to provide a motive and/or a witness. 

The second rule of R describes how the lawyer works: in order to support the
claim that X is innocent, the lawyer tries to provide an alibi for X by a person
who is not a friend of X. This rule depends on the third rule which defines the
relation friends.

Finally, the fourth rule of R is the “decision making rule” and describes how
the judge works: in order to reach a decision as to whether to charge X, the judge
examines the premises suspect(X) and $\neg$innocent(X). As explained earlier, the
values of these premises come from two different sources: the prosecutor and the
lawyer. Each of these premises can have the value true or false. However, it is
also possible that the value of a premiss is undefined. For example, if a motive is
not known and a witness has not been found, then the value of suspect(X) will
be undefined.

In view of these observations, the question is what value is appropriate to
associate with charge(X).

What we propose is to collect together the values of the premises suspect(X)
and $\neg$innocent(X), and to consider the resulting set of values as the value of
charge(X). This is precisely what the notation

charge(X) $\leftarrow$ suspect(X) $\oplus$ $\neg$innocent(X)

means, where $\oplus$ denotes the “collecting together” operation.

It follows that there are four possible values for charge(X): $\emptyset$, {true}, {false}
and {true, false}. We shall call these values Underdefined, True, False and
Overdefined, and we shall denote them by $\mathcal{U}$, $\mathcal{T}$, $\mathcal{F}$ and $\mathcal{O}$, respectively.

The value Underdefined for a premiss means that the premiss is true or
false but its actual value is currently unknown. For the purpose of this paper we
shall assume that any premiss whose value is not known is associated with the
value Underdefined.

We note that the value Underdefined is related to the so-called “null values”
of attributes in database theory. In database theory, however, a distinction is
made between two types of null values:

— the attribute value exists but is currently unknown 

— the attribute value does not exist 

An example of the first type is the Department-value for an employee that 
has just been hired but has not yet been assigned to a specific department, and 
an example of the second type is the maiden name of a male employee. The value 
Underdefined corresponds to the first type of null value. 

Returning now to our example, the decision whether to charge John depends 
on the value that charge(John) will receive when collecting the values of the 
premises together. Looking at the facts of F and the rules of R (and using 
intuition) we can see that suspect(John) and innocent (John) both receive the 
value U and so then does charge(John). 

This is clearly a case where the judge cannot decide whether to actually 
charge John! 

In the context of decision making, however, one has to reach a decision (based 
on the available facts and rules) even if some values are not defined. This can be 
accomplished by assuming values for some or all underdefined premises. Such 
an assignment of values to underdefined premises is what we call a hypothesis. 

Thus in our example, if the judge assumes the innocence of John, then 
charge(John) receives the value false and John is not charged. We note that 
this is precisely what happens in real life under similar circumstances, i.e. the 
defendant is assumed innocent until proved guilty. 
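
The “collecting together” reading of the decision rule can be checked mechanically. The following is a minimal sketch, not part of the original formalism: it assumes each of the four values is represented as a set of classical truth values, and the helper names `neg`, `collect` and `charge` are hypothetical.

```python
# Illustrative sketch: the four values as sets of classical truth values.
U = frozenset()                 # Underdefined: nothing is known
T = frozenset({True})           # True
F = frozenset({False})          # False
O = frozenset({True, False})    # Overdefined: contradictory reports


def neg(v):
    """Negation flips every classical value contained in the set."""
    return frozenset(not b for b in v)


def collect(x, y):
    """The 'collecting together' operation: union of the two sets."""
    return x | y


def charge(suspect, innocent):
    """Value of charge(X) <- suspect(X) (+) not innocent(X)."""
    return collect(suspect, neg(innocent))


# Nothing is known about suspect(John) or innocent(John): no decision.
print(charge(U, U) == U)
# The judge assumes John's innocence: charge(John) becomes False.
print(charge(U, T) == F)
```

With suspect(John) left underdefined, assuming innocence is enough to settle charge(John), which matches the informal argument above.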

Hypothesis Support for Information Integration in Four-Valued Logics

Clearly, when hypothesizing on underdefined premises we would like our
hypothesis to be “reasonable” in some sense, with respect to the available
information, i.e., with respect to the given facts and rules. Roughly speaking,
we define a hypothesis H to be “reasonable” or sound using the following test,
calling a fact f defined under H if $H(f) \ne \mathcal{U}$:

1. add H to F to produce a new set of facts $F' = F \cup H$;

2. apply the rules of R to F' to produce a new assignment of values H';

3. if the facts defined under H are assigned the same values in H' then H
is sound, otherwise H is not sound.

That is, if no fact of H has changed value as a result of rule
application then H is a sound hypothesis; otherwise H is unsound.

In our example, for instance, consider the following hypothesis: 



$H_1$:   innocent(John)   charge(John)
         $\mathcal{F}$                $\mathcal{F}$

Applying the above test we find the following values for the facts of $H_1$:

$H'_1$:  innocent(John)   charge(John)
         $\mathcal{F}$                $\mathcal{T}$

As we can see, the fact charge(John) has changed value, i.e. this fact had the
value $\mathcal{F}$ in $H_1$ and now has the value $\mathcal{T}$ in $H'_1$. Therefore, $H_1$ is not a sound
hypothesis.

Next, consider the following hypothesis: 



$H_2$:   innocent(John)   charge(John)
         $\mathcal{F}$                $\mathcal{T}$

Applying again our test we find:

$H'_2$:  innocent(John)   charge(John)
         $\mathcal{F}$                $\mathcal{T}$

That is, the values of the facts of $H_2$ remain unchanged in $H'_2$, thus $H_2$ is a
sound hypothesis.

Intuitively, if our hypothesis is sound this means that what we have assumed 
is compatible with the given facts and rules. 
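
The three-step test can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the formal definition: only the decision rule for charge(John) is re-evaluated, the other atoms keep their current value, and the two hypotheses follow the values stated in the text (charge(John) is F in the first hypothesis and becomes T after rule application). The function names are hypothetical.

```python
# Four values as sets of classical truth values (illustrative encoding).
U, T, F, O = (frozenset(), frozenset({True}),
              frozenset({False}), frozenset({True, False}))


def neg(v):
    return frozenset(not b for b in v)


def apply_rules(facts):
    """Step 2 (simplified): re-derive charge(John) only; keep other atoms."""
    derived = dict(facts)
    suspect = facts.get("suspect(John)", U)
    innocent = facts.get("innocent(John)", U)
    derived["charge(John)"] = suspect | neg(innocent)  # suspect (+) not innocent
    return derived


def is_sound(hypothesis, facts):
    combined = {**facts, **hypothesis}     # step 1: F' = F u H
    derived = apply_rules(combined)        # step 2: compute H'
    # step 3: every fact defined under H keeps its value in H'
    return all(derived[a] == v for a, v in hypothesis.items() if v != U)


facts = {"witness(John)": F, "friends(John,Ted)": T}
H1 = {"innocent(John)": F, "charge(John)": F}
H2 = {"innocent(John)": F, "charge(John)": T}

print(is_sound(H1, facts))   # charge(John) changes value, so H1 fails the test
print(is_sound(H2, facts))   # no assumed value changes, so H2 passes
```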

From now on let us denote by $\mathcal{P}$ the facts of F together with the rules of R, i.e.
$\mathcal{P} = (F, R)$, and let us call $\mathcal{P}$ a program.

In principle, we may assume or hypothesize values for every possible ground 
atom. However, given a program P and a hypothesis H , we cannot expect H to 
be sound with respect to P, in general. What we can expect is that some “part” 
of H is sound with respect to P. 

More precisely, given two hypotheses H and H', call H a part of H', denoted
$H \le H'$, if $H(f) \ne \mathcal{U}$ implies $H(f) = H'(f)$, i.e., if H agrees with H' on every
defined fact. It is then natural to ask, given program $\mathcal{P}$ and hypothesis H, what
is the maximal part of H that is sound with respect to $\mathcal{P}$. We call this maximal
part the support of H by $\mathcal{P}$, and we denote it by $s_{\mathcal{P}}^H$. Intuitively, the support of
H indicates how much of H can be assumed safely, i.e., remaining compatible
with the facts and rules of $\mathcal{P}$.

We show that the support $s_{\mathcal{P}}^H$ can be used to define a hypothesis-based
semantics of $\mathcal{P} = (F, R)$, denoted by $sem_{\mathcal{P}}^H$. This is done by a fixpoint computation
that uses an immediate consequence operator T as follows:

— $F_0 = F$;

— $F_{i+1} = T(F_i) \oplus s_{\mathcal{P}}^H$.

We also show that there is an interesting connection between hypothesis-based
semantics and the semantics of Datalog programs with negation. More precisely,
we show that if $\mathcal{P}$ is a Datalog program with negation then:

— if H is the everywhere false hypothesis then $sem_{\mathcal{P}}^H$ coincides with the
well-founded semantics of $\mathcal{P}$, and

— if H is the everywhere underdefined hypothesis then $sem_{\mathcal{P}}^H$ coincides with
the Kripke-Kleene semantics of $\mathcal{P}$.

As we shall see, these results allow us to extend the well-founded semantics and
the Kripke-Kleene semantics of Datalog programs with negation to the broader
class of Fitting programs.

Motivation for this work comes from the area of knowledge acquisition, where 
contradictions may occur during the process of collecting knowledge from diffe- 
rent experts. Indeed, in multi-agent systems, different agents may give different 
answers to the same query. It is then important to be able to process the answers 
so as to extract the maximum of information on which the various agents agree, 
or to detect the items on which the agents give conflicting answers. 

Motivation also comes from the area of deductive databases. Updates leading
to a certain degree of inconsistency should be allowed because inconsistency can
lead to useful information, especially within the framework of distributed
databases. In particular, Fuhr and Rölleke showed that hypermedia retrieval
requires the handling of inconsistent information.

The remainder of the paper is organized as follows. In Section 2 we recall
very briefly some definitions and notations from well-founded semantics,
Belnap’s logic $\mathcal{FOUR}$ and Fitting programs. We then proceed, in Section 3, to
define sound hypotheses and their support by a Fitting program $\mathcal{P}$; we also
discuss computational issues and we present algorithms for computing the
support of a hypothesis by a program $\mathcal{P}$ and the hypothesis-based semantics of $\mathcal{P}$.
In Section 4 we show that the notion of support actually unifies the notions
of well-founded semantics and Kripke-Kleene semantics and extends them from
Datalog programs with negation to the broader class of Fitting programs.
Section 5 contains concluding remarks and suggestions for further research. Proofs
of theorems are omitted due to lack of space.

2 Preliminaries

2.1 Three-Valued Logics

Well-founded semantics. Well-founded semantics of logic programs was first
proposed by Van Gelder et al. In their approach, an interpretation I is a set of
ground literals that does not contain both $A$ and $\neg A$ for any atom $A$. Now, if we
consider an instantiated program $\mathcal{P}$ defined as in their work, its well-founded
semantics is defined using the following two operators on partial
interpretations I:

— the immediate consequence operator $T_{\mathcal{P}}$, defined by

$T_{\mathcal{P}}(I) = \{head(r) \mid r \in \mathcal{P} \wedge \forall B \in body(r),\ B \in I\}$, and

— the unfounded operator $U_{\mathcal{P}}$, where $U_{\mathcal{P}}(I)$ is defined to be the greatest
unfounded set with respect to the partial interpretation I.

We recall that a set of instantiated atoms U is said to be unfounded with
respect to I if for all instantiated atoms $A \in U$ and for all rules $r \in \mathcal{P}$ the
following holds:

$head(r) = A \Rightarrow \exists B \in body(r)\ (\neg B \in I \vee B \in U)$

It has been proven that $U_{\mathcal{P}}(I) = \mathcal{HB} \setminus SPF_{\mathcal{P}}(I)$, where $\mathcal{HB}$ is the Herbrand
base and $SPF_{\mathcal{P}}(I)$ is the limit of the increasing sequence $[SPF^i_{\mathcal{P}}(I)]_{i \ge 1}$ defined
by:

— $SPF^1_{\mathcal{P}}(I) = \{head(r) \mid r \in \mathcal{P} \wedge pos(body(r)) = \emptyset \wedge \forall B \in body(r),\ \neg B \notin I\}$

— $SPF^{i+1}_{\mathcal{P}}(I) = \{head(r) \mid r \in \mathcal{P} \wedge pos(body(r)) \subseteq SPF^i_{\mathcal{P}}(I) \wedge \forall B \in body(r),\ \neg B \notin I\}$, $i > 0$.

The atoms of $SPF_{\mathcal{P}}(I)$ are called potentially founded atoms.

The operator $W_{\mathcal{P}}$, called the well-founded operator, is then defined by
$W_{\mathcal{P}}(I) = T_{\mathcal{P}}(I) \cup \neg U_{\mathcal{P}}(I)$ and is shown to be monotone with respect to set inclusion.
The well-founded semantics of $\mathcal{P}$ is defined to be the least fixpoint of $W_{\mathcal{P}}$.
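
The operators above translate directly into a small fixpoint computation. The sketch below is illustrative only and rests on assumptions that are not in the text: a ground rule is encoded as a triple `(head, positive_body, negative_body)`, a literal as a pair `(atom, sign)`, and the Herbrand base is taken to be the set of atoms occurring in the rules. The example program `RULES` is hypothetical.

```python
def herbrand_base(rules):
    atoms = set()
    for head, pos, neg in rules:
        atoms |= {head, *pos, *neg}
    return atoms


def immediate_consequences(rules, interp):
    """T_P(I): heads of rules whose body literals all belong to I."""
    return {h for h, pos, neg in rules
            if all((b, True) in interp for b in pos)
            and all((b, False) in interp for b in neg)}


def potentially_founded(rules, interp):
    """Limit of the SPF sequence: atoms that may still become true."""
    spf = set()
    while True:
        nxt = {h for h, pos, neg in rules
               if set(pos) <= spf
               and all((b, False) not in interp for b in pos)   # body literal b not refuted
               and all((b, True) not in interp for b in neg)}   # body literal not-b not refuted
        if nxt == spf:
            return spf
        spf = nxt


def well_founded_step(rules, interp):
    """W_P(I) = T_P(I) together with the negation of the unfounded atoms."""
    unfounded = herbrand_base(rules) - potentially_founded(rules, interp)
    return ({(a, True) for a in immediate_consequences(rules, interp)}
            | {(a, False) for a in unfounded})


def well_founded_model(rules):
    """Least fixpoint of W_P, reached by iterating from the empty set."""
    interp = set()
    while True:
        nxt = well_founded_step(rules, interp)
        if nxt == interp:
            return interp
        interp = nxt


# Hypothetical program: p <- not q;  q <- not p;  r <- not s;  t <- t
RULES = [("p", [], ["q"]), ("q", [], ["p"]), ("r", [], ["s"]), ("t", ["t"], [])]
print(sorted(well_founded_model(RULES)))
```

In this example `s` (no rule) and `t` (self-supporting only) are unfounded and become false, `r` becomes true, and `p` and `q` stay undefined because neither literal appears in the result.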



Kripke-Kleene semantics. The Kripke-Kleene semantics was introduced by Fitting.
In this approach, a valuation is a function from the Herbrand base to
the set of logical values {true, false, unknown}. Now, given an instantiated
program $\mathcal{P}$, its Kripke-Kleene semantics is defined using an
operator $\Phi_{\mathcal{P}}$ on valuations, defined as follows: given a valuation v and a ground atom A,

— if there is a rule in $\mathcal{P}$ with head A, and the truth value of the body under v
is true, then $\Phi_{\mathcal{P}}(v)(A) = true$;

— if there is a rule in $\mathcal{P}$ with head A, and for every rule in $\mathcal{P}$ with head A the
truth value of the body under v is false, then $\Phi_{\mathcal{P}}(v)(A) = false$;

— else $\Phi_{\mathcal{P}}(v)(A) = unknown$.
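
As a hedged illustration of this operator, the sketch below iterates it from the everywhere-unknown valuation. The encoding is an assumption made for the example: `None` stands for unknown, a rule is a pair `(head, body)`, a body is a conjunction of literals `(atom, positive)` evaluated in Kleene's strong three-valued logic, and a fact is a rule with an empty body. Following the definition as stated above, an atom that heads no rule stays unknown.

```python
def literal_value(v, atom, positive):
    val = v.get(atom)                 # a missing atom counts as unknown (None)
    if val is None:
        return None
    return val if positive else not val


def body_value(v, body):
    """Strong Kleene conjunction of the body literals."""
    vals = [literal_value(v, a, pos) for a, pos in body]
    if any(x is False for x in vals):
        return False
    if all(x is True for x in vals):  # also covers the empty body of a fact
        return True
    return None


def phi(rules, v):
    """One application of the operator to the valuation v."""
    out = {}
    for atom in {h for h, _ in rules}:
        bodies = [body_value(v, b) for h, b in rules if h == atom]
        if any(x is True for x in bodies):
            out[atom] = True
        elif all(x is False for x in bodies):
            out[atom] = False
        else:
            out[atom] = None
    return out


def kripke_kleene(rules):
    """Iterate from the everywhere-unknown valuation until nothing changes."""
    v = {}
    while True:
        nxt = phi(rules, v)
        if nxt == v:
            return v
        v = nxt


# Hypothetical program: a.  s <- not a.  r <- not s.  p <- not q.  q <- not p.  t <- t.
RULES = [("a", []), ("s", [("a", False)]), ("r", [("s", False)]),
         ("p", [("q", False)]), ("q", [("p", False)]), ("t", [("t", True)])]
print(kripke_kleene(RULES))
```

Unlike the well-founded semantics, the self-supporting atom `t` stays unknown here instead of becoming false.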
2.2 Four-Valued Logics

Belnap’s four-valued logic. In [5], Belnap defines a logic called $\mathcal{FOUR}$ intended
to deal with incomplete and inconsistent information. Belnap’s logic uses
four logical values that we shall denote by $\mathcal{F}$, $\mathcal{T}$, $\mathcal{U}$ and $\mathcal{O}$, i.e.
$\mathcal{FOUR} = \{\mathcal{F}, \mathcal{T}, \mathcal{U}, \mathcal{O}\}$. These values can be compared using two orderings, the knowledge
ordering and the truth ordering.

In the knowledge ordering, denoted by $\le_k$, the four values are ordered as
follows: $\mathcal{U} \le_k \mathcal{F}$, $\mathcal{U} \le_k \mathcal{T}$, $\mathcal{F} \le_k \mathcal{O}$, $\mathcal{T} \le_k \mathcal{O}$. Intuitively, according to this
ordering, each value of $\mathcal{FOUR}$ is seen as a possible knowledge that one can
have about the truth of a given statement. More precisely, this knowledge is
expressed as a set of classical truth values that hold for that statement. Thus,
$\mathcal{F}$ is seen as {false}, $\mathcal{T}$ is seen as {true}, $\mathcal{U}$ is seen as $\emptyset$ and $\mathcal{O}$ is seen as
{false, true}. Following this viewpoint, the knowledge ordering is just the set
inclusion ordering.

In the truth ordering, denoted by $\le_t$, the four logical values are ordered as
follows: $\mathcal{F} \le_t \mathcal{U}$, $\mathcal{F} \le_t \mathcal{O}$, $\mathcal{U} \le_t \mathcal{T}$, $\mathcal{O} \le_t \mathcal{T}$. Intuitively, according to this ordering,
each value of $\mathcal{FOUR}$ is seen as the degree of truth of a given statement. $\mathcal{U}$ and $\mathcal{O}$
are both less false than $\mathcal{F}$, and less true than $\mathcal{T}$, but $\mathcal{U}$ and $\mathcal{O}$ are not comparable.

The two orderings are represented in the double Hasse diagram of Figure 1.

Fig. 1. The logic $\mathcal{FOUR}$

Both $\le_t$ and $\le_k$ give $\mathcal{FOUR}$ a lattice structure. Meet and join under the
truth ordering are denoted by $\wedge$ and $\vee$, and they are natural generalizations of
the usual notions of conjunction and disjunction. In particular, $\mathcal{U} \wedge \mathcal{O} = \mathcal{F}$ and
$\mathcal{U} \vee \mathcal{O} = \mathcal{T}$. Under the knowledge ordering, meet and join are denoted by $\otimes$ and
$\oplus$, and are called the consensus and gullibility, respectively: $x \otimes y$ represents the
maximal information on which x and y agree, whereas $x \oplus y$ adds the knowledge
represented by x to that represented by y. In particular, $\mathcal{F} \otimes \mathcal{T} = \mathcal{U}$ and
$\mathcal{F} \oplus \mathcal{T} = \mathcal{O}$.

There is a natural notion of negation in the truth ordering denoted by $\neg$,
and we have: $\neg\mathcal{F} = \mathcal{T}$, $\neg\mathcal{T} = \mathcal{F}$, $\neg\mathcal{U} = \mathcal{U}$, $\neg\mathcal{O} = \mathcal{O}$. There is a similar notion
for the knowledge ordering, called conflation, denoted by $-$, and we have:
$-\mathcal{U} = \mathcal{O}$, $-\mathcal{O} = \mathcal{U}$, $-\mathcal{F} = \mathcal{F}$, $-\mathcal{T} = \mathcal{T}$.

The operations $\vee$, $\wedge$, $\neg$ restricted to the values $\mathcal{T}$ and $\mathcal{F}$ are those of classical
logic, and if we add to these operations and values the value $\mathcal{U}$ then they are
those of Kleene’s strong three-valued logic.
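
The identities listed above can be verified with a compact encoding. The sketch below is one possible representation, chosen here for illustration and not taken from the paper: each value is a pair (evidence for truth, evidence for falsity), so that the knowledge ordering is componentwise and the truth ordering reverses the second component. All function names are hypothetical.

```python
# Each value is a pair (told_true, told_false).
T, F, U, O = (True, False), (False, True), (False, False), (True, True)


def conj(x, y):          # meet in the truth ordering
    return (x[0] and y[0], x[1] or y[1])


def disj(x, y):          # join in the truth ordering
    return (x[0] or y[0], x[1] and y[1])


def neg(x):              # negation swaps the two kinds of evidence
    return (x[1], x[0])


def consensus(x, y):     # meet in the knowledge ordering
    return (x[0] and y[0], x[1] and y[1])


def gullibility(x, y):   # join in the knowledge ordering
    return (x[0] or y[0], x[1] or y[1])


def conflation(x):       # knowledge-ordering analogue of negation
    return (not x[1], not x[0])


def leq_k(x, y):         # knowledge ordering: set inclusion of evidence
    return x[0] <= y[0] and x[1] <= y[1]


def leq_t(x, y):         # truth ordering: more truth, less falsity
    return x[0] <= y[0] and y[1] <= x[1]


print(conj(U, O) == F, disj(U, O) == T)
print(consensus(F, T) == U, gullibility(F, T) == O)
print(conflation(U) == O, conflation(O) == U)
```

Restricting the arguments to `T`, `F` and `U` gives back Kleene's strong three-valued tables, for instance `conj(T, U) == U` and `disj(T, U) == T`.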



Fitting programs. Conventional logic programming has the set $\{\mathcal{F}, \mathcal{T}\}$ as
its intended space of truth values but since not every query may produce an
answer partial models are often allowed (i.e. $\mathcal{U}$ is added). If we want to deal with
inconsistency as well then $\mathcal{O}$ must be added. Thus Fitting asserts that $\mathcal{FOUR}$
can be thought of as the “home” of ordinary logic programming and extends the
notion of logic program, as follows:

Definition 1. (Fitting program)

— A formula is an expression built up from literals and elements of $\mathcal{FOUR}$,
using $\wedge$, $\vee$, $\otimes$, $\oplus$, $\exists$, $\forall$.

— A clause is of the form $P(x_1, \ldots, x_n) \leftarrow \phi(x_1, \ldots, x_n)$, where the atomic
formula $P(x_1, \ldots, x_n)$ is the head, and the formula $\phi(x_1, \ldots, x_n)$ is the body.
It is assumed that the free variables of the body are among $x_1, \ldots, x_n$.

— A program is a finite set of clauses with no predicate letter appearing in the
head of more than one clause (this apparent restriction causes no loss of
generality).

We shall represent a Fitting program as a pair (F, R) where F is a function
from the Herbrand base into $\mathcal{FOUR}$ and R a set of clauses. This is possible
because every fact can be seen as a rule of the form $A \leftarrow v$, where A is an atom
and v is a value in $\mathcal{FOUR}$.

A Datalog program with negation can be seen as a Fitting program whose
underlying truth-value space is the subset $\{\mathcal{F}, \mathcal{T}, \mathcal{U}\}$ of $\mathcal{FOUR}$ and which does
not involve $\otimes$, $\oplus$, $\forall$, $\mathcal{U}$, $\mathcal{O}$, $\mathcal{F}$.



3 Hypothesis Testing 

In the remainder of this paper, in order to simplify the presentation, we assume
that all Fitting programs are instantiated programs. Moreover, we use the term
“program” to mean “Fitting program”, unless explicitly stated otherwise.



3.1 Interpretations 

First, we introduce some terminology and notation that we shall use throughout
the paper. Given a program $\mathcal{P}$, call interpretation of $\mathcal{P}$ any function I over the
Herbrand base $\mathcal{HB}_{\mathcal{P}}$ such that, for every atom A of $\mathcal{HB}_{\mathcal{P}}$, I(A) is a value from
$\mathcal{FOUR}$.

Two interpretations I and J are compatible if, for every ground atom A,
$(I(A) \ne \mathcal{U}$ and $J(A) \ne \mathcal{U}) \Rightarrow I(A) = J(A)$.

An interpretation I is a part of an interpretation J, denoted $I \le J$, if
$I(A) \ne \mathcal{U}$ implies $I(A) = J(A)$, for every ground atom A. Clearly, the part-of relation
just defined is a partial ordering on the set $\mathcal{V}(\mathcal{FOUR})$ of all interpretations over
$\mathcal{FOUR}$. Given an interpretation I, we denote by def(I) the set of all ground
atoms A such that $I(A) \ne \mathcal{U}$. Moreover, if S is any set of ground atoms, we
define the restriction of I to S, denoted by $I/S$, as follows: for all $A \in \mathcal{HB}_{\mathcal{P}}$,

$$I/S(A) = \begin{cases} I(A) & \text{if } A \in S, \\ \mathcal{U} & \text{otherwise.} \end{cases}$$

The operations of $\mathcal{FOUR}$ can be extended naturally to $\mathcal{V}(\mathcal{FOUR})$ in the
following way: $(I \oplus J)(A) = I(A) \oplus J(A)$ and similarly for the other operations.
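
These notions are easy to experiment with once interpretations are stored as finite maps. The sketch below is an illustration under assumptions of its own: an interpretation is a Python dict from ground atoms to values, any atom missing from the dict is treated as underdefined, and values are encoded as sets of classical truth values so that the pointwise join is set union. The function names are hypothetical.

```python
U, T, F, O = (frozenset(), frozenset({True}),
              frozenset({False}), frozenset({True, False}))


def value(interp, atom):
    return interp.get(atom, U)          # missing atoms are underdefined


def defined(interp):
    """def(I): the atoms whose value differs from U."""
    return {a for a, v in interp.items() if v != U}


def compatible(i, j):
    """I and J never disagree on an atom that both of them define."""
    return all(value(j, a) in (U, v) for a, v in i.items() if v != U)


def is_part_of(i, j):
    """I <= J: J agrees with I on every atom defined by I."""
    return all(value(j, a) == v for a, v in i.items() if v != U)


def restrict(interp, atoms):
    """I/S: keep I on S, underdefined elsewhere."""
    return {a: v for a, v in interp.items() if a in atoms}


def oplus(i, j):
    """Pointwise extension of the knowledge join to interpretations."""
    return {a: value(i, a) | value(j, a) for a in set(i) | set(j)}


H = {"innocent": T, "charge": F}
J = {"innocent": T, "charge": F, "suspect": T}
print(is_part_of(H, J), is_part_of(J, H))
print(oplus({"a": T}, {"a": F, "b": T}))
```

Note that the part-of relation is stricter than compatibility: `H` above is a part of `J`, while `J` is only compatible with `H`.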

The actions of interpretations can be extended from atoms to formulas as
follows:

- $I(X \wedge Y) = I(X) \wedge I(Y)$, and similarly for the other operators,

- $I((\exists x)\phi(x)) = \bigvee_{t = \text{closed term}} I(\phi(t))$, and

- $I((\forall x)\phi(x)) = \bigwedge_{t = \text{closed term}} I(\phi(t))$.

If B is a closed formula then we say that B evaluates to the logical value a,
with respect to an interpretation I, denoted by $B = a$ w.r.t. I or by $B =_I a$, if
$J(B) = a$ for any interpretation J such that $I \le J$ (i.e. if the value of B is equal
to a with respect to the defined atoms of I whatever the values of underdefined
atoms could be). There are formulas B in which underdefined atoms do not
matter for the logical value that can be associated with B. For example let us
take $B = A \vee C$ and let the interpretation I be defined by $I(A) = \mathcal{U}$, $I(C) = \mathcal{T}$;
then no matter how A is interpreted B is evaluated to $\mathcal{T}$, that is, $B =_I \mathcal{T}$.

Given an interpretation I let $I_{\mathcal{O}}$ be the interpretation defined by: if $I(A) \ne \mathcal{U}$
then $I_{\mathcal{O}}(A) = I(A)$ else $I_{\mathcal{O}}(A) = \mathcal{O}$, for every atom A. The following lemma
provides a method of testing whether $B =_I a$, based on the interpretation $I_{\mathcal{O}}$.

Lemma 1. Given a closed formula B, $B =_I a$ iff $I(B) = a$ and $I_{\mathcal{O}}(B) = a$.
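
Lemma 1 turns a quantification over all extensions of I into two evaluations. The sketch below illustrates that test under assumptions made for the example: values are pairs (evidence for truth, evidence for falsity), a quantifier-free closed formula is a nested tuple such as `("or", ("atom", "A"), ("atom", "C"))`, and the parameter `fill` says which value replaces underdefined atoms. All names are hypothetical.

```python
T, F, U, O = (True, False), (False, True), (False, False), (True, True)

OPS = {
    "and": lambda x, y: (x[0] and y[0], x[1] or y[1]),
    "or": lambda x, y: (x[0] or y[0], x[1] and y[1]),
    "oplus": lambda x, y: (x[0] or y[0], x[1] or y[1]),
    "otimes": lambda x, y: (x[0] and y[0], x[1] and y[1]),
}


def evaluate(interp, formula, fill=U):
    """Value of a formula; underdefined atoms are read as `fill`."""
    kind = formula[0]
    if kind == "atom":
        v = interp.get(formula[1], U)
        return fill if v == U else v
    if kind == "not":
        x = evaluate(interp, formula[1], fill)
        return (x[1], x[0])
    return OPS[kind](evaluate(interp, formula[1], fill),
                     evaluate(interp, formula[2], fill))


def evaluates_to(formula, interp, a):
    """Lemma 1 test: I(B) = a and I_O(B) = a."""
    return evaluate(interp, formula, U) == a and evaluate(interp, formula, O) == a


I = {"A": U, "C": T}
B = ("or", ("atom", "A"), ("atom", "C"))
print(evaluates_to(B, I, T))                       # A does not matter here
print(evaluates_to(("and", ("atom", "A"), ("atom", "C")), I, U))
```

For the conjunction the two evaluations differ (U under I, O under the O-completion), so the formula has no determined value with respect to I.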



3.2 The Support of a Hypothesis 

Given a program $\mathcal{P} = (F, R)$ we consider two ways of inferring information from
$\mathcal{P}$. First by activating the rules of R in order to derive new facts from those of
F, through an immediate consequence operator T. Second, by a kind of default
reasoning based on a given hypothesis.

The immediate consequence operator T that we use takes as input the facts of
F and returns an interpretation T(F), defined as follows: for all ground atoms A,

$$T_R(F)(A) = \begin{cases} a & \text{if } A \leftarrow B \in R \text{ and } B =_F a, \\ \mathcal{U} & \text{otherwise.} \end{cases}$$



What we call a hypothesis is actually just an interpretation H. However, we
use the term “hypothesis” to stress the fact that the values assigned by H to
the atoms of the Herbrand base are assumed values, and not values that have
been computed using the facts and rules of the program. As such, a hypothesis
H must be tested against the “sure” knowledge provided by $\mathcal{P}$. The test consists
of “adding” H to F, then activating the rules of $\mathcal{P}$ (using T) to derive an
interpretation H'. If $H \le H'$, then the hypothesis H is a sound one, i.e. the
values defined by H are not in contradiction with those defined by $\mathcal{P}$. Hence the
following definition:

Definition 2 (Sound Hypothesis). Let $\mathcal{P} = (F, R)$ be a program and H a
hypothesis. H is sound w.r.t. $\mathcal{P}$ if

— F and H are compatible, and

— $H/Head(\mathcal{P}) \le T(F \oplus H)$, where $Head(\mathcal{P}) = \{A \mid \exists\, A \leftarrow B \in \mathcal{P}\}$.

We use the restriction of H to $Head(\mathcal{P})$ before making the comparison with
$T(F \oplus H)$ because all atoms which are not head of any rule of $\mathcal{P}$ will be assigned
the value Underdefined by $T(F \oplus H)$. Then H and $T(F \oplus H)$ are compatible
on these atoms.

Even if a hypothesis H is not sound w.r.t. $\mathcal{P}$, it may be that some part of
H is sound w.r.t. $\mathcal{P}$. Of course, we are interested to know what is the maximal
part of H that is sound w.r.t. $\mathcal{P}$. We shall call this maximal part the “support”
of H. To see that the maximal part of H is unique (and thus that the support
is a well-defined concept), we give the following lemma:

Lemma 2. If $H_1$ and $H_2$ are two sound parts of H w.r.t. $\mathcal{P}$, then $H_1 \oplus H_2$ is
sound w.r.t. $\mathcal{P}$.

Thus the maximal sound part of H is defined by $\bigoplus \{H' \mid H' \le H$ and H'
is sound w.r.t. $\mathcal{P}\}$.

Definition 3 (Support). Let $\mathcal{P}$ be a program and H a hypothesis. The support
of H w.r.t. $\mathcal{P}$, denoted $s_{\mathcal{P}}^H$, is the maximal sound part of H w.r.t. $\mathcal{P}$ (where
maximality is understood w.r.t. the part-of ordering $\le$).
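
Definitions 2 and 3 can be tried out on a toy program. The sketch below is deliberately naive and is not the algorithm of the paper: it enumerates every part of H, keeps the sound ones, and joins them, which is justified by Lemma 2 but takes time exponential in the number of defined atoms. Further assumptions, made only for this example: values are pairs (evidence for truth, evidence for falsity), R is a dict mapping each head to a quantifier-free body, the body test "B evaluates to a" is done as in Lemma 1, and the program contains only the decision rule of Example 1.

```python
from itertools import combinations

T, F, U, O = (True, False), (False, True), (False, False), (True, True)


def oplus(x, y):
    return (x[0] or y[0], x[1] or y[1])


def evaluate(interp, formula, fill):
    """Value of a body; underdefined atoms are read as `fill` (U or O)."""
    kind = formula[0]
    if kind == "atom":
        v = interp.get(formula[1], U)
        return fill if v == U else v
    if kind == "not":
        x = evaluate(interp, formula[1], fill)
        return (x[1], x[0])
    return oplus(evaluate(interp, formula[1], fill),   # kind == "oplus"
                 evaluate(interp, formula[2], fill))


def consequence(rules, interp):
    """Immediate consequences: a head gets a only if its body is determined."""
    out = {}
    for head, body in rules.items():
        a, a_o = evaluate(interp, body, U), evaluate(interp, body, O)
        if a == a_o:
            out[head] = a
    return out


def join(i, j):
    return {a: oplus(i.get(a, U), j.get(a, U)) for a in set(i) | set(j)}


def is_part_of(i, j):
    return all(j.get(a, U) == v for a, v in i.items() if v != U)


def compatible(i, j):
    return all(j.get(a, U) in (U, v) for a, v in i.items() if v != U)


def is_sound(facts, rules, hyp):
    on_heads = {a: v for a, v in hyp.items() if a in rules}   # H/Head(P)
    return (compatible(facts, hyp)
            and is_part_of(on_heads, consequence(rules, join(facts, hyp))))


def support(facts, rules, hyp):
    """Brute force: join of all sound parts of the hypothesis."""
    atoms = [a for a, v in hyp.items() if v != U]
    best = {}
    for size in range(len(atoms) + 1):
        for subset in combinations(atoms, size):
            part = {a: hyp[a] for a in subset}
            if is_sound(facts, rules, part):
                best = join(best, part)
    return best


RULES = {"charge": ("oplus", ("atom", "suspect"), ("not", ("atom", "innocent")))}
H1 = {"suspect": F, "innocent": T, "charge": T}
print(support({}, RULES, H1))
```

Here the assumed value of `charge` contradicts what the rule derives from the other two assumptions, so the support keeps `suspect` and `innocent` and drops `charge`.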

We now give an algorithm for computing the support $s_{\mathcal{P}}^H$ of a hypothesis H
w.r.t. a program $\mathcal{P}$.

Consider the following sequence $(PF_i)$, $i \ge 0$:

— $PF_0 = \emptyset$;

— $PF_{i+1} = \{A \mid A \leftarrow B \in \mathcal{P}$ and $B \ne H(A)$ w.r.t. $F \oplus$ …
for all $i \ge 0$.

The intuition here is that we want to evaluate step by step the atoms that
could potentially have a logical value different from their values in H. We have
the following results:

Proposition 1. The sequence $(PF_i)$, $i \ge 0$ is increasing with respect to set
inclusion and it has a limit reached in a finite number of steps. This limit is
denoted PF.

Theorem 1. Let $\mathcal{P}$ and H be fixed. Then $s_{\mathcal{P}}^H = H$ …

4 Hypothesis Based Semantics

As we explained earlier, given a program $\mathcal{P} = (F, R)$, we derive information in
two ways: by activating the rules (i.e. by applying the immediate consequence
operator T) and by making a hypothesis H and computing its support w.r.t.
$\mathcal{P}$. On the whole, the information that we derive comes from $T(F) \oplus s_{\mathcal{P}}^H$.

Proposition 2. The sequence $(F_n)$, $n \ge 0$ defined by $F_0 = F$ and
$F_{n+1} = T_R(F_n) \oplus s_{\mathcal{P}}^H$ is increasing with respect to $\le$, so it has a limit denoted by
$sem_{\mathcal{P}}^H$.

We recall that an interpretation I is a model of a program $\mathcal{P}$ if for every rule
$A \leftarrow B$ of $\mathcal{P}$, $I(B) \le_t I(A)$.

Proposition 3. The interpretation $sem_{\mathcal{P}}^H$ is a model of $\mathcal{P}$.

This justifies the following definition of semantics for $\mathcal{P}$.

Definition 4 (H-semantics of $\mathcal{P}$). The interpretation $sem_{\mathcal{P}}^H$ is defined to be
the semantics of $\mathcal{P}$ w.r.t. H or the H-semantics of $\mathcal{P}$.

Following this definition, any given program $\mathcal{P}$ can be associated with different
semantics, one for each possible hypothesis H. Theorem 2 below asserts that
this approach extends the usual semantics of Datalog programs with negation
to a broader class of programs, namely the Fitting programs.

Two remarks are in order here before stating Theorem 2. First, if we restrict
our attention to three values only, i.e. $\mathcal{F}$, $\mathcal{T}$ and $\mathcal{U}$, then our definition
of interpretation is equivalent to the one used by Van Gelder et al., in
the following sense: given an interpretation I following our definition, the set
$\{A \mid I(A) = \mathcal{T}\} \cup \{\neg A \mid I(A) = \mathcal{F}\}$ is a partial interpretation in their sense;
conversely, given a partial interpretation J in their sense, the function I defined
by: $I(A) = \mathcal{T}$ if $A \in J$, $I(A) = \mathcal{F}$ if $\neg A \in J$, and $I(A) = \mathcal{U}$ otherwise, is an
interpretation in our sense.

Second, if we restrict our attention to Datalog programs with negation (recall
that the class of Fitting programs strictly contains the Datalog programs
with negation) then the concept of sound interpretation for the everywhere false
hypothesis reduces to that of unfounded set of Van Gelder et al. The
difference is that their definition has rather a syntactic flavor, while ours has
a semantic flavor. Moreover, our definition not only extends the concept of
unfounded set to four-valued logic, but also generalizes its definition to any given
hypothesis H (not just the everywhere false hypothesis).

Theorem 2. Let $\mathcal{P}$ be a Datalog program with negation.

1. If $H_{\mathcal{F}}$ is the everywhere false hypothesis, then $sem_{\mathcal{P}}^{H_{\mathcal{F}}}$ coincides with the
well-founded semantics of $\mathcal{P}$;

2. If $H_{\mathcal{U}}$ is the everywhere underdefined hypothesis, then $sem_{\mathcal{P}}^{H_{\mathcal{U}}}$ coincides with
the Kripke-Kleene semantics of $\mathcal{P}$.






5 Concluding Remarks 

We have defined a formal framework for information integration based on
hypothesis testing. A basic concept of this framework is the support provided by a
program $\mathcal{P} = (F, R)$ to a hypothesis H. The support of H is the maximal part
of H that does not contradict the facts of F or the facts derived from F using
the rules of R.

We have then used the concept of support to define hypothesis-based se- 
mantics for the class of Fitting programs, and we have given an algorithm for 
computing these semantics. 

Finally, we have shown that our semantics extends the well-founded semantics
and the Kripke-Kleene semantics to Belnap’s four-valued logic, and also
generalizes them in the following sense: if we restrict our attention to three-valued logics
then for $H_{\mathcal{F}}$ the everywhere false interpretation our semantics reduces to the
well-founded semantics, and for $H_{\mathcal{U}}$ the everywhere underdefined interpretation
our semantics reduces to the Kripke-Kleene semantics.

We believe that hypothesis-based semantics can be useful not only in the
context of information integration but also in the context of explanation-based
systems. Indeed, assume that a given hypothesis H turns out to be a part of the
H-semantics of a program $\mathcal{P}$. Then $\mathcal{P}$ can be seen as an “explanation” of the
hypothesis H. We are currently investigating several aspects of this
explanation-oriented viewpoint.

Masaccio*:

A Formal Model for Embedded Components** 



Thomas A. Henzinger

Abstract. Masaccio is a formal model for hybrid dynamical systems
which are built from atomic discrete components (difference equations) 
and atomic continuous components (differential equations) by parallel 
and serial composition, arbitrarily nested. Each system component con- 
sists of an interface, which determines the possible ways of using the 
component, and a set of executions, which define the possible behaviors 
of the component in real time. 



We formally define a class of entities called “components.” The intended use of 
components is to provide a formal, structured model for software and hardware 
that interacts with a physical environment in real time. The model is formal 
in that it defines a component as a mathematical object, which can be ana- 
lyzed. The model is structured in that it permits the hierarchical definition of 
a component, and the hierarchy can be exploited for structuring the analysis. 
Components are built from atomic components using six operations: parallel 
composition, serial composition, renaming of variables (data), renaming of loca- 
tions (control), hiding of variables, and hiding of locations. There are two kinds 
of atomic components. An atomic discrete component is a difference equation 
which governs the instantaneous change of state. An atomic continuous compo- 
nent is a differential equation, which governs the evolutionary change of state 
over time. The mathematical semantics of a component is given by its inter- 
face and its set of executions. The interface of a component determines how the 
component can interact (be composed) with other components. Each execution 
specifies a possible behavior of the component as a sequence of instantaneous
and evolutionary state changes.

Fig. 1. A railroad crossing (safety: always $[\,|x| < 100 \Rightarrow y_1 = y_2 = 0\,]$)

The interface of a component. Data enters and exits a component through
variables; control enters and exits through locations. All variables are assumed
to be typed, with domains such as the booleans $\mathbb{B}$, the nonnegative integers $\mathbb{N}$,
and the reals $\mathbb{R}$. For each variable x, we assume that there is a primed version
x' which has the same type as x. For a set V of variables, we denote by [V] the
set of type-conforming value assignments to the variables in V: if $x \in V$ and
$q \in [V]$, then q(x) is the value assigned by q to x. The interface of a component
A consists of five parts:



— A finite set $V^{in}_A$ of input variables. We write $V'^{in}_A$ for the set of primed 
variables whose unprimed versions are input variables. 

— A finite set $V^{out}_A$ of output variables. We require that the input and output 
variables are disjoint; that is, $V^{in}_A \cap V^{out}_A = \emptyset$. We refer to the collection 
$V^{in,out}_A = V^{in}_A \cup V^{out}_A$ of input and output variables as I/O variables. The 
value assignments in $[V^{in,out}_A]$ are called I/O states. Given an I/O state q, 
we denote by q' the value assignment in $[V'^{in}_A]$ which is derived from q in the 
following way: q'(x') = q(x) for all input variables $x \in V^{in}_A$. 

— A binary relation $\prec_A \subseteq V^{in,out}_A \times V^{out}_A$ of dependencies between I/O variables 
and output variables. The value of an output variable y can depend on 
previous values of any I/O variable x; intuitively, if $x \prec_A y$, then the value 
of y can depend, without delay, also on the concurrent value of x. A set 
U of I/O variables is dependency-closed if for all $x, y \in V^{in,out}_A$, if $x \prec_A y$ 
and $y \in U$, then $x \in U$. For example, the set $V^{in}_A$ of input variables is 
dependency-closed. 

— A finite set $L^{intf}_A$ of interface locations. These are the locations through which 
control can enter or exit the component A. 

— For each interface location $a \in L^{intf}_A$, a predicate $\psi^{en}_A(a)$ on the variables in 
$V^{in,out}_A \cup V'^{in}_A$. Given two I/O states p and q, the entry condition $\psi^{en}_A(a)$ 
is either true or false at (p, q'), i.e., if each unprimed variable $x \in V^{in,out}_A$ is 
assigned the value p(x), and each primed variable $y' \in V'^{in}_A$ is assigned the 
value q'(y'). Intuitively, if the current I/O state is p, and the input portion 
of the next I/O state is q', then the component A can be entered at location 
a iff the entry condition $\psi^{en}_A(a)$ is true at (p, q'). 
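
The variable part of such an interface can be sketched as a small data structure. The following Python sketch is only an illustration: the class name `Interface`, its field names, and the example variables u, y, z are assumptions, and locations and entry conditions are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interface:
    """Variable part of a component interface (illustrative names)."""
    inputs: frozenset
    outputs: frozenset
    deps: frozenset = frozenset()  # pairs (x, y): y may depend on x without delay

    def __post_init__(self):
        # Input and output variables must be disjoint.
        if self.inputs & self.outputs:
            raise ValueError("input and output variables must be disjoint")
        # Dependencies relate I/O variables to output variables.
        for x, y in self.deps:
            if x not in self.io_vars or y not in self.outputs:
                raise ValueError(f"ill-formed dependency {(x, y)}")

    @property
    def io_vars(self):
        return self.inputs | self.outputs

    def is_dependency_closed(self, subset):
        """True if x < y and y in subset always imply x in subset."""
        return all(x in subset for (x, y) in self.deps if y in subset)


# Hypothetical example: y depends on input u, and z depends on y.
iface = Interface(
    inputs=frozenset({"u"}),
    outputs=frozenset({"y", "z"}),
    deps=frozenset({("u", "y"), ("y", "z")}),
)
```

In this sketch the set of input variables is dependency-closed automatically, because dependencies only point into output variables.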

We will distinguish between discrete and hybrid components. If A is a discrete 
component, then all I/O variables of A have discrete types, such as B or N. 
Hybrid components have also I/O variables of type R. 

The executions of a component. The possible finite behaviors of a component 
are called executions. Consider a component A. A jump of A is a pair 
$(p, q) \in [V^{in,out}_A]^2$ of I/O states. The observation p is called the source of the 
jump, and q is the sink. A flow of A is a pair $(\delta, f)$ consisting of a positive real 
$\delta \in \mathbb{R}_{>0}$, and a function $f : \mathbb{R} \to [V^{in,out}_A]$ from the reals to I/O states which 
is differentiable* on the compact interval $[0, \delta] \subset \mathbb{R}$. The real $\delta$ is called the 
duration of the flow, the observation f(0) is the source, and the observation $f(\delta)$ 
is the sink. A step of A is either a jump or a flow of A. The step w is successive to 
the step v if the sink of v is equal to the source of w. An execution of A is either 
a pair (a, w) or a triple (a, w, b), where $a, b \in L^{intf}_A$ are interface locations and 
$w = w_0 \cdots w_n$ is a nonempty, finite sequence of steps of A such that (1) the first 
step $w_0$ is a jump, and (2) each subsequent step $w_i$, for $1 \le i \le n$, is successive 
to the immediately preceding step $w_{i-1}$. The location a is called the origin of 
the execution, the sequence w is the trace, and the location b (when present) 
is the destination. If A is a discrete component, then all traces of A consist of 
jumps only; the traces of hybrid components also contain flows. We write $E_A$ for 
the set of executions of the component A. We require that $E_A$ is prefix-closed, 
deadlock-free, and input-permissive. Prefix closure ensures that the executions 
of a component can be generated operationally in a stepwise manner. The set 
$E_A$ of executions is prefix-closed if the following four conditions are satisfied: 

1. If $(a, w, b) \in E_A$, then $(a, w) \in E_A$. 

2. If $(a, w_0 \cdots w_n) \in E_A$ for $n \ge 1$, then $(a, w_0 \cdots w_{n-1}) \in E_A$. 

3. If $(a, w \cdot (\delta, f)) \in E_A$ for a flow $(\delta, f)$, then $(a, w \cdot (\varepsilon, f)) \in E_A$ for all reals 
$\varepsilon \in (0, \delta)$. 

4. If $(a, (p, q)) \in E_A$ for a jump (p, q), then the entry condition $\psi^{en}_A(a)$ is true 
at (p, q'). 
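
For a finite set of discrete executions, the conditions that do not involve flows can be checked mechanically. The sketch below is a simplification under stated assumptions: executions are Python tuples `(a, w)` or `(a, w, b)`, a trace is a tuple of jumps `(p, q)` with hashable I/O states, and `entry_condition(a, p, q)` is a hypothetical callback that receives the whole next state q rather than only its primed input part. Condition 3 concerns flows and is not modelled.

```python
def is_prefix_closed(executions, entry_condition):
    """Check conditions 1, 2 and 4 on a finite set of discrete executions."""
    for execution in executions:
        origin, trace = execution[0], execution[1]
        if len(execution) == 3:
            # Condition 1: dropping the destination yields an execution.
            if (origin, trace) not in executions:
                return False
        elif len(trace) > 1:
            # Condition 2: dropping the last step yields an execution.
            if (origin, trace[:-1]) not in executions:
                return False
        else:
            # Condition 4: a single initial jump satisfies the entry condition.
            p, q = trace[0]
            if not entry_condition(origin, p, q):
                return False
    return True


# Hypothetical example with integer I/O states.
jump1, jump2 = (0, 1), (1, 2)
closed = {
    ("a", (jump1,)),
    ("a", (jump1, jump2)),
    ("a", (jump1, jump2), "b"),
}
always_enabled = lambda a, p, q: True
```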

Deadlock freedom ensures that the stepwise generation of executions cannot 
deadlock inside a component. The set $E_A$ of executions is deadlock-free if the 
following two conditions are satisfied: 

1. For all interface locations a and I/O states p, if the entry condition $\psi^{en}_A(a)$ 
is true at (p, q') for some I/O state q, then $(a, (p, q)) \in E_A$ for some jump 
(p, q). In other words, if the entry condition of location a is satisfiable at the 
I/O state p, then there is an execution with origin a and source p. 

2. If $(a, w) \in E_A$, then either $(a, w, b) \in E_A$ for some interface location b, or 
$(a, w \cdot (p, q)) \in E_A$ for some jump (p, q). In other words, every execution 
which does not have a destination can be prolonged by either a destination 
or a jump. 

* On types other than R, it can be assumed that only the constant functions are 
differentiable. 

Fig. 3. The component Train 

Input permissiveness ensures that a component cannot constrain the behavior 
of input variables. The set $E_A$ of executions is input-permissive if the following 
two conditions are satisfied: 

1. If $(a, (p, q_1)) \in E_A$ for a jump $(p, q_1)$, then for every dependency-closed set 
U of I/O variables, and every I/O state $q_2$ such that (1) the I/O state $q_2$ 
agrees with $q_1$ on the variables in U and (2) the entry condition $\psi^{en}_A(a)$ is 
true at $(p, q_2')$, there is an execution $(a, (p, q)) \in E_A$ whose sink q agrees with 
$q_2$ on the variables in U and the input variables. 

2. If $(a, w \cdot (p, q_1)) \in E_A$ for a nonempty trace w and a jump $(p, q_1)$, then for 
every dependency-closed set U of I/O variables, and every I/O state $q_2$ which 
agrees with $q_1$ on the variables in U, there is an execution $(a, w \cdot (p, q)) \in E_A$ 
whose sink q agrees with $q_2$ on the variables in U and the input variables. 

If two components A and B have the same interface, then they can take each 
other's place in all contexts. We say that A refines (or implements) B if (1) the 
components A and B have the same interface and (2) every execution of A is 
also an execution of B; that is, $E_A \subseteq E_B$. If A refines B, then B can be thought 
of as a more abstract (permissive) version of A, with some details (constraints) 
left out in B which are spelt out in A. Since the executions of A are deadlock- 
free, if B has an execution with origin a, and A refines B, then A must also 
have an execution with origin a. Thus a component with a nonempty set of 
executions cannot be trivially implemented by a component with the empty set 
of executions. 
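
For finite execution sets, the refinement relation reduces to an interface comparison and a subset test. The following sketch assumes interfaces are values that can be compared with `==` and that executions are stored in Python sets; the function name `refines` and the example data are hypothetical.

```python
def refines(interface_a, executions_a, interface_b, executions_b):
    """A refines B: same interface and every execution of A is one of B."""
    return interface_a == interface_b and executions_a <= executions_b


# Hypothetical example: the specification allows two jumps from state 0,
# the implementation resolves the choice and keeps only one of them.
iface = ("inputs: u", "outputs: y")
spec = {("a", ((0, 0),)), ("a", ((0, 1),))}
impl = {("a", ((0, 1),))}
```

Note that the subset test alone does not check deadlock freedom or input permissiveness; those are requirements on each execution set individually.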

The parallel composition of components. Two components A and B can be 
composed in parallel if their interfaces satisfy the following three conditions: 

1. The output variables of A and B are disjoint; that is, $V^{out}_A \cap V^{out}_B = \emptyset$. 

2. There is no inferred mutual dependency between an output variable of A and 
an output variable of B; that is, there do not exist two variables $x \in V^{out}_A$ 
and $y \in V^{out}_B$ such that both $x \prec^*_B y$ and $y \prec^*_A x$, where $\prec^*$ is the transitive 
closure of the dependency relation $\prec$. 

3. For each interface location a common to both A and B, the entry conditions 
of a are equivalent in A and B; that is, if $a \in L^{intf}_A \cap L^{intf}_B$, then the entry 
condition $\psi^{en}_A(a)$ is equivalent to the entry condition $\psi^{en}_B(a)$. This implies, 
in particular, that $\psi^{en}_A(a)$ does not constrain the primed outputs of B, nor 
does $\psi^{en}_B(a)$ constrain the primed outputs of A. 

If the components A and B can be composed in parallel, then A||B is again a 
component. The interface of the component A||B is defined from the interfaces 
of the subcomponents A and B: 

— A variable is an input to A||B if it is an input to A but not an output of B, 
or an input to B but not an output of A; that is, 
$V^{in}_{A\|B} = (V^{in}_A \setminus V^{out}_B) \cup (V^{in}_B \setminus V^{out}_A)$. 

— A variable is an output of A||B if it is an output of A or an output of B; 
that is, $V^{out}_{A\|B} = V^{out}_A \cup V^{out}_B$. 

— The dependencies of A||B are inherited from both A and B; that is, 
$\prec_{A\|B} = \prec_A \cup \prec_B$. 

— The interface locations of A||B are the interface locations of A together with 
the interface locations of B; that is, $L^{intf}_{A\|B} = L^{intf}_A \cup L^{intf}_B$. 

— If a is an interface location of both subcomponents A and B, then they agree 
on the entry condition, and this is also the entry condition of A||B; that is, 
if $a \in L^{intf}_A \cap L^{intf}_B$, then $\psi^{en}_{A\|B}(a)$ is obtained from $\psi^{en}_A(a)$, or equivalently 
from $\psi^{en}_B(a)$, by quantifying over the primed variables x' with 
$x \in V^{in}_A \cap V^{out}_B$ and the primed variables y' with $y \in V^{in}_B \cap V^{out}_A$. It follows that 
the component A||B can be entered at location a iff both subcomponents 
A and B can be entered concurrently at a. The quantifiers (whose force, 
existential or universal, is immaterial) ensure syntactically that no primed 
output variables occur freely in entry conditions. All other interface locations 
of A||B have the unsatisfiable entry condition; that is, if $a \in L^{intf}_A \setminus L^{intf}_B$ or 
$a \in L^{intf}_B \setminus L^{intf}_A$, then $\psi^{en}_{A\|B}(a) = false$. These locations can be used only to 
exit the component A||B. 
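
The variable part of this construction, together with the first two composability conditions, can be sketched in Python. This is an illustration under assumptions: interfaces are plain dictionaries with hypothetical keys `"in"`, `"out"` and `"deps"`, and the third condition (agreement of entry conditions) is not modelled.

```python
def transitive_closure(relation):
    """Smallest transitive relation containing `relation` (a set of pairs)."""
    closure = set(relation)
    while True:
        new_pairs = {(x, z) for (x, y) in closure for (y2, z) in closure if y == y2}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs


def parallel_interface(a, b):
    """Inputs, outputs and dependencies of A||B, or ValueError if not composable."""
    # Condition 1: output variables are disjoint.
    if a["out"] & b["out"]:
        raise ValueError("output variables must be disjoint")
    # Condition 2: no inferred mutual dependency between outputs of A and B.
    closure_a = transitive_closure(a["deps"])
    closure_b = transitive_closure(b["deps"])
    for out_a in a["out"]:
        for out_b in b["out"]:
            if (out_a, out_b) in closure_b and (out_b, out_a) in closure_a:
                raise ValueError(f"mutual dependency between {out_a} and {out_b}")
    return {
        "in": (a["in"] - b["out"]) | (b["in"] - a["out"]),
        "out": a["out"] | b["out"],
        "deps": a["deps"] | b["deps"],
    }


# Hypothetical example: a controller reads x and writes y; a plant reads
# y and u and writes x, where x depends on y without delay.
controller = {"in": {"x"}, "out": {"y"}, "deps": set()}
plant = {"in": {"y", "u"}, "out": {"x"}, "deps": {("y", "x")}}
composite = parallel_interface(controller, plant)
```

In the example, x and y become outputs of the composite, and u remains as its only input.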

The executions of the component A||B are defined from the executions of the 
subcomponents A and B: 

— The pair (a, w) is an execution of A||B iff $(a, w|_A)$ is an execution of A and 
$(a, w|_B)$ is an execution of B, where $w|_C$ is the restriction of the trace w to 
values for the I/O variables of the component C. 

— The triple (a, w, b) is an execution of A||B iff either $(a, w|_A, b)$ is an execution 
of A and $(a, w|_B)$ is an execution of B, or $(a, w|_B, b)$ is an execution of B 
and $(a, w|_A)$ is an execution of A. 

In other words, parallel composition acts conjunctively on traces. In particular, 
each jump of A corresponds to a concurrent jump of B, and each flow of A 
corresponds to a concurrent flow of B with the same duration. If an execution of 
A reaches a destination, then the concurrent execution of B is terminated; if B 
reaches a destination, then the concurrent execution of A is terminated; if both 
A and B simultaneously reach destinations, then one of the two destinations is 
chosen nondeterministically. Note that the operator || for parallel composition is 
associative and commutative. Furthermore, the refinement relation is preserved 
by parallel composition: if A and B are two components with the same interface, 
if A refines B, and if A (and therefore also B) can be composed in parallel with 
a component C, then A||C refines B||C. 
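
The conjunctive reading of traces can be sketched for discrete executions without destinations. In this sketch, which is an assumption-laden simplification, an I/O state is a frozenset of `(variable, value)` pairs, and the helper names `state`, `restrict_state`, `restrict_trace` and `is_parallel_execution` are hypothetical.

```python
def state(**values):
    """Build a hashable I/O state from keyword arguments."""
    return frozenset(values.items())


def restrict_state(io_state, variables):
    """Keep only the values of the given variables."""
    return frozenset((name, value) for name, value in io_state if name in variables)


def restrict_trace(trace, variables):
    """Restrict every jump (p, q) of a discrete trace to the given variables."""
    return tuple(
        (restrict_state(p, variables), restrict_state(q, variables))
        for p, q in trace
    )


def is_parallel_execution(origin, trace, exec_a, vars_a, exec_b, vars_b):
    """(origin, trace) is an execution of A||B iff both restrictions are executions."""
    return (
        (origin, restrict_trace(trace, vars_a)) in exec_a
        and (origin, restrict_trace(trace, vars_b)) in exec_b
    )


# Hypothetical example: A increments x, B keeps y unchanged.
exec_a = {("a", ((state(x=0), state(x=1)),))}
exec_b = {("a", ((state(y=0), state(y=0)),))}
joint = ((state(x=0, y=0), state(x=1, y=0)),)
bad = ((state(x=0, y=0), state(x=1, y=1)),)
```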

The serial composition of components. Two components A and B can be 
composed in series if their interfaces agree on the output variables; that is, 
$V^{out}_A = V^{out}_B$. If the components A and B can be composed in series, then A + B 
is again a component. The interface of the component A + B is defined from the 
interfaces of the subcomponents A and B: 

— A variable is an input to A + B if it is an input to A or an input to B; that 
is, $V^{in}_{A+B} = V^{in}_A \cup V^{in}_B$. 

— As A and B agree on their outputs, these are also the outputs of A + B; that 
is, $V^{out}_{A+B} = V^{out}_A = V^{out}_B$. 

— The dependencies of A + B are inherited from both A and B; that is, 
$\prec_{A+B} = \prec_A \cup \prec_B$. 

— The interface locations of A + B are the interface locations of A together 
with the interface locations of B; that is, $L^{intf}_{A+B} = L^{intf}_A \cup L^{intf}_B$. 

— If a is an interface location of both A and B, then the entry condition of a in 
A + B is the disjunction of the entry conditions of a in the subcomponents A 
and B; that is, if $a \in L^{intf}_A \cap L^{intf}_B$, then $\psi^{en}_{A+B}(a) = \psi^{en}_A(a) \lor \psi^{en}_B(a)$. If a is an 
interface location of A but not of B, then the entry condition of a in A + B 
is inherited from A; that is, if $a \in L^{intf}_A \setminus L^{intf}_B$, then $\psi^{en}_{A+B}(a) = \psi^{en}_A(a)$. If a 
is an interface location of B but not of A, then the entry condition of a in 
A + B is inherited from B; that is, if $a \in L^{intf}_B \setminus L^{intf}_A$, then $\psi^{en}_{A+B}(a) = \psi^{en}_B(a)$. 
This is because the component A + B is entered at location a iff either 
subcomponent A or subcomponent B is entered at a. 



The executions of the component A + B are defined from the executions of the 
subcomponents A and B: 

— The pair (a, w) is an execution of A + B iff either $(a, w|_A)$ is an execution 
of A, or $(a, w|_B)$ is an execution of B. 

— The triple (a, w, b) is an execution of A + B iff either $(a, w|_A, b)$ is an 
execution of A, or $(a, w|_B, b)$ is an execution of B. 

In other words, serial composition acts disjunctively on traces. Note that the 
operator + for serial composition is associative, commutative, and idempotent. 
Furthermore, the refinement relation is preserved by serial composition: if A 
and B are two components with the same interface, if A refines B, and if A (and 
therefore also B) can be composed in series with a component C, then A + C 
refines B + C. 
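
The disjunctive treatment of entry conditions in a serial composition can be sketched as follows. The representation is an assumption: entry conditions are stored in dictionaries mapping a location name to a Python predicate over a current state p and a next state q, and the function name and the example locations and thresholds are hypothetical.

```python
def serial_entry_conditions(entry_a, entry_b):
    """Entry conditions of A + B from those of A and B.

    A shared location gets the disjunction of both conditions; any other
    location inherits its condition from the component that owns it.
    """
    combined = {}
    for location in entry_a.keys() | entry_b.keys():
        if location in entry_a and location in entry_b:
            cond_a, cond_b = entry_a[location], entry_b[location]
            # Bind the two predicates as defaults to avoid late binding.
            combined[location] = (
                lambda p, q, cond_a=cond_a, cond_b=cond_b: cond_a(p, q) or cond_b(p, q)
            )
        elif location in entry_a:
            combined[location] = entry_a[location]
        else:
            combined[location] = entry_b[location]
    return combined


# Hypothetical example with a shared location "start".
entry_a = {"start": lambda p, q: p["x"] > 1000, "a_only": lambda p, q: True}
entry_b = {"start": lambda p, q: p["x"] <= 0}
combined = serial_entry_conditions(entry_a, entry_b)
```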



Variable renaming. When constructing a parallel composition A||B, inputs of 
A can be identified with outputs of B, and vice versa, by renaming variables. 
The variable x can be renamed to y in component A if x is an I/O variable 
of A and y is different from all I/O variables of A; that is, $x \in V^{in,out}_A$ and 
$y \notin V^{in,out}_A$. If x can be renamed to y in A, then A[x := y] is again a component. 
The interface of the component A[x := y] is defined from the interface of A: 
the I/O variables of A[x := y] are those of A with x replaced by y, the interface 
locations are unchanged, and the dependency relation $\prec_{A[x:=y]}$ and the entry 
conditions $\psi^{en}_{A[x:=y]}$ result from renaming x to y in $\prec_A$ and in $\psi^{en}_A$, 
respectively. The executions of the component A[x := y] result from renaming 
x to y in the traces of the executions of A. The refinement relation is preserved 
by the renaming of variables: if A and B are two components with the same 
interface, if A refines B, and if x can be renamed to y in A (and therefore also 
in B), then A[x := y] refines B[x := y]. 

Location renaming. When constructing a serial composition A + B, inter- 
face locations of A can be identified with interface locations of B by renaming 
locations. The location a can be renamed to b in component A if a is an in- 
terface location of A; that is, $a \in L^{intf}_A$. The location b may or may not be 
an interface location of A. If a can be renamed to b in A, then A[a := b] is 
again a component. The interface of the component A[a := b] is defined from 
the interface of A: let $V^{in}_{A[a:=b]} = V^{in}_A$, let $V^{out}_{A[a:=b]} = V^{out}_A$, let $\prec_{A[a:=b]} = \prec_A$, 
let $L^{intf}_{A[a:=b]} = (L^{intf}_A \setminus \{a\}) \cup \{b\}$, let $\psi^{en}_{A[a:=b]}(b) = \psi^{en}_A(a)$ if $b \notin L^{intf}_A$, let 
$\psi^{en}_{A[a:=b]}(b) = \psi^{en}_A(a) \lor \psi^{en}_A(b)$ if $b \in L^{intf}_A$, and let $\psi^{en}_{A[a:=b]}(c) = \psi^{en}_A(c)$ for 
all locations $c \in L^{intf}_A \setminus \{a, b\}$. Consequently, if both a and b are interface loca- 
tions of A, then the component A[a := b] can be entered at location b whenever 
the original component A can be entered at either a or b. The executions of the 
component A[a := b] result from renaming a to b in the origins and destinations 
of the executions of A. The refinement relation is preserved by the renaming of 
locations: if A and B are two components with the same interface, if A refines B, 
and if a can be renamed to b in A (and therefore also in B), then A[a := b] refines 
B[a := b]. 



Variable hiding. Hiding renders a variable local to a component, and invisible 
to the outside. Hidden variables do not maintain their values from one exit of a 
component to a subsequent entry, but they are nondeterministically reinitialized 
upon every entry to the component so as to satisfy the applicable entry condition. 
The variable x can be hidden in the component A if x is an output variable 
of A; that is, $x \in V^{out}_A$. If x can be hidden in A, then A\x is again a compo- 
nent. The interface of the component A\x is defined from the interface of A: let 
$V^{in}_{A \setminus x} = V^{in}_A$, let $V^{out}_{A \setminus x} = V^{out}_A \setminus \{x\}$, let $\prec_{A \setminus x}$ be the intersection of the transitive 
closure $\prec^*_A$ with $V^{in,out}_{A \setminus x} \times V^{out}_{A \setminus x}$, let $L^{intf}_{A \setminus x} = L^{intf}_A$, and let 
$\psi^{en}_{A \setminus x}(a) = (\exists x)\, \psi^{en}_A(a)$ for all locations $a \in L^{intf}_A$. In other words, upon entry 
of the component A\x 
at location a, the output variable x has an unknown value which permits the 
satisfaction of the entry condition $\psi^{en}_A(a)$. The executions of the component A\x 
result from restricting the traces of the executions of A to values for the I/O 
variables of A\x. Note that the component A\x\y is identical to the component 
A\y\x. Furthermore, the refinement relation is preserved by the hiding of vari- 
ables: if A and B are two components with the same interface, if A refines B, 
and if x can be hidden in A (and therefore also in B), then A\x refines B\x. 

Location hiding. Hiding renders a location internal to a component, and inac- 
cessible from the outside. The location c can be hidden in the component A if 
c is an interface location of A and the entry condition of c is valid; that is, 
$c \in L^{intf}_A$ and $\psi^{en}_A(c)$ is equivalent to true. Consequently, an interface location 
c of A can be hidden only if the component A cannot deadlock at c, no mat- 
ter what the current I/O state and the next inputs. If c can be hidden in A, 
then A\c is again a component. The interface of the component A\c is defined 
from the interface of A: let $V^{in}_{A \setminus c} = V^{in}_A$, let $V^{out}_{A \setminus c} = V^{out}_A$, let $\prec_{A \setminus c} = \prec_A$, let 
$L^{intf}_{A \setminus c} = L^{intf}_A \setminus \{c\}$, and let $\psi^{en}_{A \setminus c}(a) = \psi^{en}_A(a)$ for all locations $a \in L^{intf}_{A \setminus c}$. The 
executions of the component A\c are defined from the executions of A: 

— The pair (a, w) is an execution of A\c iff $c \neq a$ and either (a, w) is an 
execution of A, or there is a finite sequence $w_1, \ldots, w_n$ of traces, $n \ge 2$, 
such that $w = w_1 \cdots w_n$ and the following are all executions of A: the triple 
$(a, w_1, c)$, the triples $(c, w_i, c)$ for all 1 < i < n, and the pair $(c, w_n)$. 

— The triple (a, w, b) is an execution of A\c iff $c \notin \{a, b\}$ and either (a, w, b) is 
an execution of A, or there is a finite sequence $w_1, \ldots, w_n$ of traces, $n \ge 2$, 
such that $w = w_1 \cdots w_n$ and the following are all executions of A: the triple 
$(a, w_1, c)$, the triples $(c, w_i, c)$ for all 1 < i < n, and the triple $(c, w_n, b)$. 

In other words, the executions of A\c result from stringing together, at location c, 
a finite number of executions of A. Note that the component A\c\d is identical 
to the component A\d\c. Furthermore, the refinement relation is preserved by 
the hiding of locations: if A and B are two components with the same interface, 
if A refines B, and if c can be hidden in A (and therefore also in B), then A\c 
refines B\c. 

Atomic discrete components. The discrete components are built from atomic 
discrete components using the six operations of parallel and serial composition, 
variable and location renaming, and variable and location hiding. Each atomic 
discrete component is specified by a jump action. A jump action J consists of a 
finite set $X_J$ of source variables, a finite set $Y_J$ of uncontrolled sink variables, a 
finite set $Z_J$ of controlled sink variables disjoint from $Y_J$, and a predicate 
on the variables in $X_J \cup Y_J' \cup Z_J'$, where V' is the set of primed versions of the vari- 
ables in V. This predicate is called the jump predicate of J; it is typically written as 
a guarded difference equation. The jump action J specifies the component A(J). 
The interface of the component A(J) is defined as follows: 

— The inputs to A(J) are the source variables of J which are not controlled 
sink variables, together with the uncontrolled sink variables; that is, 
$V^{in}_{A(J)} = (X_J \setminus Z_J) \cup Y_J$. 

— The outputs of A(J) are the controlled sink variables of J; that is, 
$V^{out}_{A(J)} = Z_J$. 

— Each controlled sink variable depends on each uncontrolled sink variable; 
that is, for all $x \in V^{in,out}_{A(J)}$ and $y \in V^{out}_{A(J)}$, define $x \prec_{A(J)} y$ iff $x \in Y_J$ and 
$y \in Z_J$. 

— The component A(J) has two interface locations, say, from and to; that is, 
$L^{intf}_{A(J)} = \{from, to\}$. 

— The entry condition of from is the projection of the jump predicate to the 
source variables and the primed versions of the uncontrolled sink variables; 
that is, $\psi^{en}_{A(J)}(from)$ is the jump predicate with the primed controlled sink 
variables $Z_J'$ existentially quantified. The entry condition of to is unsatisfi- 
able; that is, $\psi^{en}_{A(J)}(to) = false$. 

The executions of the component A(J) are defined as follows: the pair (a, w) 
is an execution of A(J) iff a = from and the trace w consists of a single jump 
(p, q) such that the jump predicate is true if each source variable $x \in X_J$ is 
assigned the value p(x), and each primed sink variable $y' \in Y_J' \cup Z_J'$ is assigned the 
value q(y). Moreover, the triple (a, w, b) is an execution of A(J) iff the pair (a, w) 
is an execution of A(J), and b = to. In other words, the traces of the atomic 
discrete component A(J) are the jumps that satisfy the jump predicate. 
From any given source, there may be no such jumps or there may be several. 
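
The jumps of an atomic discrete component can be enumerated when all variables range over a small finite domain. The following Python sketch is illustrative only: the function `atomic_jumps`, the dictionary encoding of I/O states, and the bounded counter used as an example are assumptions, not part of the formal model.

```python
from itertools import product


def atomic_jumps(x_vars, y_vars, z_vars, domain, predicate):
    """Enumerate the jumps (p, q) of A(J) over a finite domain.

    `predicate(source, sink)` plays the role of the jump predicate: `source`
    assigns p(x) to each source variable x, and `sink` assigns q(y) to each
    (primed) sink variable y.
    """
    sink_vars = set(y_vars) | set(z_vars)
    io_vars = sorted(set(x_vars) | sink_vars)
    states = [dict(zip(io_vars, values)) for values in product(domain, repeat=len(io_vars))]
    result = []
    for p in states:
        for q in states:
            source = {x: p[x] for x in x_vars}
            sink = {y: q[y] for y in sink_vars}
            if predicate(source, sink):
                result.append((p, q))
    return result


# Hypothetical jump action: a counter x that increments while x < 2,
# i.e. the guarded difference equation  x < 2 -> x' := x + 1.
counter_jumps = atomic_jumps(
    x_vars={"x"},
    y_vars=set(),
    z_vars={"x"},
    domain=range(3),
    predicate=lambda source, sink: source["x"] < 2 and sink["x"] == source["x"] + 1,
)
```

From the source x = 2 the example has no jump, which matches the remark that a given source may admit no jumps at all.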
Atomic continuous components. The hybrid components are built from both 
atomic discrete components and atomic continuous components using the six 
operations on components. Each atomic continuous component is specified by 
a flow action. A flow action F consists of a finite set $X_F$ of source variables, a 
finite set $Y_F$ of uncontrolled flow variables of type R, a finite set $Z_F$ of controlled 
flow variables of type R disjoint from $Y_F$, and a predicate on the variables 
in $X_F \cup \dot{Y}_F \cup \dot{Z}_F$, where $\dot{V}$ is the set of dotted versions of the variables in V. We 
use the notation $\dot{V}$ only if all variables in V have type R, with the intent that 
the dotted variable $\dot{x} \in \dot{V}$ represents the first derivative of $x \in V$. This predicate 
is called the flow predicate of F; it is typically written as a guarded differential 
equation. The flow action F specifies the component A(F). The interface of the 
component A(F) is defined as follows: 

— The inputs to A(F) are the source variables of F which are not controlled 
flow variables, together with the uncontrolled flow variables; that is, 
$V^{in}_{A(F)} = (X_F \setminus Z_F) \cup Y_F$. 

— The outputs of A(F) are the controlled flow variables of F; that is, 
$V^{out}_{A(F)} = Z_F$. 

— Each controlled flow variable depends on each uncontrolled flow variable; 
that is, for all $x \in V^{in,out}_{A(F)}$ and $y \in V^{out}_{A(F)}$, define $x \prec_{A(F)} y$ iff $x \in Y_F$ and 
$y \in Z_F$. 

— The component A(F) has two interface locations, say, from and to; that is, 
$L^{intf}_{A(F)} = \{from, to\}$. 

— The entry conditions of from and to are unsatisfiable; that is, $\psi^{en}_{A(F)}(from) = 
\psi^{en}_{A(F)}(to) = false$. This ensures that jumps take precedence over flows, in 
the sense that if a component A wishes to jump and concurrently another 
component B wishes to flow, then the parallel composition A||B will jump. 

The executions of the component A(F) are defined as follows: the pair (a, w) is an 
execution of A(F) iff a = from and the trace w consists of a single flow $(\delta, f)$ such 
that for all reals $\varepsilon \in [0, \delta]$, the flow predicate is true if each source variable 
$x \in X_F$ is assigned the value $f(\varepsilon)(x)$, and each dotted flow variable $\dot{y} \in \dot{Y}_F \cup \dot{Z}_F$ 
is assigned the value $f'(\varepsilon)(y)$, where f' is the first derivative of f. Moreover, the 
triple (a, w, b) is an execution of A(F) iff the pair (a, w) is an execution of A(F), 
and b = to. In other words, the traces of the atomic continuous component A(F) 
are the flows that at all times satisfy the flow predicate. From any given 
source and duration, there may be no such flows or there may be several. If there 
is a flow of a given duration, then there is a flow for each shorter duration as 
well. 
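
A flow predicate can be tested numerically on sample points of a candidate flow. The sketch below is a sampled approximation and not a proof that the predicate holds at all times; the function `flow_satisfies`, the sampling scheme, and the example predicate (a derivative kept between 40 and 50) are assumptions chosen for illustration.

```python
def flow_satisfies(f, df, duration, predicate, samples=100):
    """Sampled check that predicate(f(e), df(e)) holds for e in [0, duration].

    `f` maps a time to an I/O state and `df` maps a time to the values of
    the dotted variables (the first derivative of `f`).
    """
    for k in range(samples + 1):
        e = duration * k / samples
        if not predicate(f(e), df(e)):
            return False
    return True


# Hypothetical flow action: keep the derivative of x between 40 and 50.
speed_ok = lambda io_state, derivative: 40 <= derivative["x"] <= 50

steady = (lambda e: {"x": 45.0 * e}, lambda e: {"x": 45.0})
too_fast = (lambda e: {"x": 60.0 * e}, lambda e: {"x": 60.0})
```

If the sampled check passes for a duration, it also passes for the same flow evaluated over any shorter duration with the same predicate, mirroring the closure of flows under shorter durations.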

Example. Figures 1-10 illustrate parts of a component which models the 
control of a railway crossing. In the figures we use the following conventions. 
Components are represented by rectangles. Input and output variables are rep- 
resented, respectively, by arrows to and from component boundaries. Locations 
are represented by little black disks, and between locations, jump actions are 
represented by arrows with solid (black) points, and flow actions are represented 
by arrows with hollow (white) points. Interface locations are drawn on compo- 
nent boundaries. Variables which are identified by renaming are connected by 
solid lines; locations which are identified by renaming are connected by dotted 
lines. The event type E is similar to the boolean type B, except that if a variable 
x has type E, then it is of interest when the value of x changes (from true to 
false, or vice versa) whereas the actual value of x at any time is irrelevant. If x 
has type E, then we write x! for $x' := \neg x$ (to issue an event x), and x? for $x' \neq x$ 
(to query the presence of an event x). Instead of using jump and flow predicates, 
we annotate jump and flow actions with guarded commands, because they allow 
us to omit specifying that a variable is left unchanged. Specifically, by default, 
an omitted guard is true, an omitted list of assignments is empty, the default 
jump assignment is x' := x, and the default flow assignment is $\dot{x} := 0$. 

The component RailCrossing has three real outputs, the distance x of the 
train from the crossing, and the positions y1 and y2 of the two gates. The boolean 
input obstacle indicates whether or not the driver of the train sees an obstacle 
on the crossing, in which case she will try to stop the train. The component 
RailCrossing is the parallel composition of three subcomponents, the train Train, 
the gate mechanics Gate, and the gate controller Control. We will look only into 
the component Train, which communicates with the gate controller via the out- 
put events approach and leave, and the input events stop and go (for example, 
if the gate fails, the gate controller may signal the train to stop). The com- 
ponent Train is the serial composition of four subcomponents: the component 
Far controls the speed $\dot{x}$ of the train when it is more than 1000 meters from 
the gate; an unnamed component issues the event approach when the train is 
at 1000 meters from the gate; the component Near controls the speed $\dot{x}$ of the 
train when it is between 1000 and -100 meters from the gate; and an unnamed 
component issues the event leave when the train is at -100 meters from the 
gate. The component Far holds the speed of the train between 40 and 50 meters 
per second. The component Near is the parallel composition of three subcom- 
ponents, Radio, Brake, and Engine. The component Radio translates stop and 
go events received from the gate controller into a boolean output remote, which 
causes the train to brake. The component Brake is an OR gate which computes 
the boolean disjunction brake of the two brake signals remote and local, where 
the latter is issued by the driver when she sees an obstacle. The component 
Engine controls the acceleration dx = x of the train. It does so by switching 
between the component Drive, which accelerates the train to 50 meters per sec- 
ond, and the component Halt, which causes the train to stop. The switching 
between Drive and Halt is controlled by the boolean input brake, and occurs 
through the locations slowdown and speedup. No matter whether the train is 
accelerating or braking, as soon as it is 100 meters past the gate, the component 
Near relinquishes control. 

Abstract. We present the complete lattice of demonic languages and 
its interpretation in refinement proofs. In contrast to the conventional 
approach of refinement with an abstraction relation on the underlying 
state spaces, we introduce a notion of refinement with an abstraction 
relation on the power sets of the state spaces. This allows us to derive a 
single complete refinement rule for demonic specifications. 



1 Introduction 

In [10], a refinement semantics is presented for so-called demonic specifications 
that lifts the refinement of operations defined in formal specification languages 
such as Z and VDM to the refinement of state machines. The semantics is con- 
sistent with the conventional refinement semantics for the system underlying a 
specification in z im. Rather than using a system or state machine semantics 
via the subset ordering of prefix-closed languages m, an approach is taken 
where operations are interpreted as relations on state spaces, and input/output 
histories are used to define the semantics of the underlying state machine. This 
approach is not state dependent, as is the case for the improved failure model 
of CSP p|R] . Instead, the languages and specifications are restricted to so-called 
demonic ones, where the enabling on a new input is dependent only on the past 
input history independent of the past outputs. The refinement relation is not 
simply trace inclusion where traces may disappear during the refinement pro- 
cess; it requires the refined system to accept the same or more traces as the 
original one. As usual, refinement relations on the operational level are defined 
via abstraction relations between the underlying state spaces. In analogy to stan- 
dard simulation techniques [Ibibibll 111411,41 . it is then possible to express every 
refinement with the help of two refinement methods on the operational level, 
denoted by forward and backward refinement. 

A relational approach with abstraction relations as above will always neces- 
sitate two refinement rules for completeness. By using a predicate-transformer 
semantics, Gardiner and Morgan [2] obtain completeness of refinement by using 
a single refinement technique, called cosimulation (a predicate transformer with 
certain properties). 

J. van Leeuwen et al. (Eds.): IFIP TCS2000, LNCS 1872, pp. 564-^^^ 2000. 

© Springer-Verlag Berlin Heidelberg 2000 

In this paper, we build on the relational approach of [10] to further investigate 
the completeness of refinement. As refinement semantics we will use the partial 
ordering on prefix-closed and demonic languages introduced in [10]. We then 
extend the results from [10], by proving that this ordering defines a natural 
lattice structure on the prefix-closed and demonic languages, and discuss its 
importance for refinement proofs. This is in analogy to the subset ordering on 
languages and its interpretation in refinements I11I1I5I11I14I111I . Furthermore, we 
generalise the notion of refinement relations via abstraction relations on state 
spaces to refinement relations via abstraction relations on power sets of state 
spaces. As for the predicate transformer setting [2], it is then possible to derive 
a single complete refinement rule. 

In Section 2 we introduce the notions of prefix-closed and demonic langua- 
ges, and demonic specifications from [10]. In Section 3, we present the lattice 
structure on these languages, and in Section 4 we introduce a notion of refine- 
ment on specifications and show its soundness and completeness. We conclude 
in Section 5 with a comparison with related work. 

2 Languages and Specifications 

We view a module as a black box, accessible only through a fixed set of 
operations: the exported procedures and functions. A module interface specification 
(hereafter just specification) specifies the behaviour of the module. The syntax 
of the specification states the names of the access routines, and their inputs and 
outputs. We use Op to denote the set of all operation names, In to denote the 
set of all inputs, and Out to denote the set of all outputs. 

The semantics of the specification describes the observable behaviour of the 
operations. We are interested in comparing the behaviour of different specifica- 
tions. Because there are many ways to represent the state in a specification, we 
need a definition of behaviour that is independent of the state representation. We 
first consider histories (observable behaviours): finite, possibly empty sequences 
of the form 

$h = (c_1, v_1)(c_2, v_2) \ldots (c_n, v_n)$ 

For $i \in \{1, \ldots, n\}$, $c_i = (\iota_i, op_i)$ is a call to an operation $op_i \in Op$ with input 
$\iota_i \in In$, and $v_i \in Out$ is an output. If an operation has no input or output, we 
use the special symbol T to indicate this. We use the symbol $\varepsilon$ to denote the 
empty history. 

2.1 Languages 

The set of all histories, $\mathcal{H}$, is determined by $Op$, $In$, and $Out$. A language $\mathcal{L}$ is
defined as a subset of $\mathcal{H}$. We only consider non-empty languages that are
prefix-closed: for any history $h \in \mathcal{H}$ and any call-value pair $(c, v)$, if
$h(c, v) \in \mathcal{L}$ then $h \in \mathcal{L}$.

$$\mathcal{L}an_{\mathcal{H}} = \{\mathcal{L} \subseteq \mathcal{H} \mid \mathcal{L} \neq \emptyset \wedge \mathcal{L} \text{ is prefix-closed}\}$$

This implies $\varepsilon \in \mathcal{L}$ for all languages in $\mathcal{L}an_{\mathcal{H}}$. We introduce the following
operators on histories. For any history

$$h = (c_1, v_1)(c_2, v_2) \ldots (c_n, v_n)$$

we denote the corresponding trace or input sequence by

$$\mathcal{I}(h) = c_1 c_2 \ldots c_n$$

As for histories, we use $\varepsilon$ to denote the empty trace and thus $\mathcal{I}(\varepsilon) = \varepsilon$. For a set
of histories $H \subseteq \mathcal{H}$, we define the set of all traces of $H$ by

$$Tr(H) = \{\mathcal{I}(h) \mid h \in H\}$$

For a language $\mathcal{L}$ and a trace $t \in Tr(\mathcal{H})$, we collect all possible histories with
trace $t$ in the set

$$\Omega_{\mathcal{L}}(t) = \{h \mid h \in \mathcal{L} \wedge \mathcal{I}(h) = t\}$$

We will also use the following operators on finite sequences $\sigma = s_1 s_2 \ldots s_n$:
$front(\sigma) = s_1 s_2 \ldots s_{n-1}$, $last(\sigma) = s_n$, and $\sigma \uparrow m = s_1 \ldots s_m$. Finally, we use $\#\sigma$ to
denote the length of $\sigma$.

In [10], we show that the class of demonic languages provides a natural
semantics for data refinement in VDM and Z. Intuitively, a language $\mathcal{L}$ is demonic
if it is prefix-closed and the fact that an input is enabled depends solely on the
past input history, independent of the corresponding outputs.

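Definition 1. A language $\mathcal{L}$ is demonic if

a) $\mathcal{L}$ is prefix-closed.

b) $\forall h_1, h_2 \in \mathcal{L} : \mathcal{I}(h_1) = \mathcal{I}(h_2) \Rightarrow (\forall \iota \in In, \omega \in Out, op \in Op : h_1((\iota, op), \omega) \in \mathcal{L} \Rightarrow \exists \omega' \in Out : h_2((\iota, op), \omega') \in \mathcal{L})$

The set of demonic languages will be denoted by

$$\mathcal{L}an_{\mathcal{D}} = \{\mathcal{L} \subseteq \mathcal{H} \mid \mathcal{L} \neq \emptyset \wedge \mathcal{L} \text{ is demonic}\}$$

For instance, deterministic languages ($\forall \tau \in Tr(\mathcal{L}) : \#\Omega_{\mathcal{L}}(\tau) = 1$) and total
languages ($\forall h \in \mathcal{L}, \iota \in In, op \in Op \; \exists \omega \in Out : h((\iota, op), \omega) \in \mathcal{L}$) are demonic.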

A useful characterisation for demonic languages is the following: a language
$\mathcal{L}$ is demonic iff

$$\forall \tau \in Tr(\mathcal{L}) \setminus \{\varepsilon\} : \Omega_{\mathcal{L}}(front(\tau)) = \{h \uparrow (\#\tau - 1) \mid h \in \Omega_{\mathcal{L}}(\tau)\}$$

Unfortunately, the set of demonic languages $\mathcal{L}an_{\mathcal{D}}$ does not behave as nicely
as the set of prefix-closed languages $\mathcal{L}an_{\mathcal{H}}$, which forms a complete lattice
under the inclusion ordering $\subseteq$ and the usual set operations. In general, demonic
languages are not closed under intersection and union. We will see below that
$\mathcal{L}an_{\mathcal{D}}$ carries a lattice structure under a different ordering relation.
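We introduce a partial ordering $\Subset$ on languages. In Section 3, we use the
poset (partially ordered set)

$$(\mathcal{L}an_{\mathcal{D}}, \Subset)$$

to define a lattice structure on $\mathcal{L}an_{\mathcal{D}}$, and in Section 4, we use this lattice as a
domain for the characterisation of refinement proofs.

Definition 2. Let $\mathcal{L}$ and $\mathcal{L}'$ be languages in $\mathcal{H}$.

$$\mathcal{L}' \Subset \mathcal{L} \text{ iff } Tr(\mathcal{L}) \subseteq Tr(\mathcal{L}') \wedge \forall t \in Tr(\mathcal{L}) : \Omega_{\mathcal{L}'}(t) \subseteq \Omega_{\mathcal{L}}(t)$$

For languages $\mathcal{L}$ and $\mathcal{L}'$, $\mathcal{L}' \Subset \mathcal{L}$ if all traces in $\mathcal{L}$ also occur in $\mathcal{L}'$, and if every
history in $\mathcal{L}'$ corresponding to a trace in $\mathcal{L}$ is also a history in $\mathcal{L}$.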
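
The notions of this subsection can be made concrete for small finite languages. The following is a minimal sketch, assuming histories are encoded as Python tuples of (call, output) pairs; the function names and the example languages are illustrative and do not come from the paper. It checks the demonic property through the characterisation given above and implements the ordering of Definition 2 directly.

```python
# Histories are tuples of (call, output) pairs; a language is a set of histories.

def trace(h):
    """I(h): the input sequence of a history (outputs dropped)."""
    return tuple(call for call, _ in h)

def traces(lang):
    """Tr(L): all traces of a language."""
    return {trace(h) for h in lang}

def omega(lang, t):
    """Omega_L(t): the histories of the language that have trace t."""
    return {h for h in lang if trace(h) == t}

def is_prefix_closed(lang):
    return bool(lang) and all(h[:-1] in lang for h in lang if h)

def is_demonic(lang):
    """Prefix-closed, and Omega_L(front(t)) = {h cut to #t-1 | h in Omega_L(t)}."""
    return is_prefix_closed(lang) and all(
        {h[:-1] for h in omega(lang, t)} == omega(lang, t[:-1])
        for t in traces(lang) if t
    )

def below(l_prime, l):
    """Definition 2: l_prime is below l in the ordering on languages."""
    return traces(l) <= traces(l_prime) and all(
        omega(l_prime, t) <= omega(l, t) for t in traces(l)
    )

c = ("no_input", "c")                                   # a hypothetical call
L = {(), ((c, 1),), ((c, 2),), ((c, 1), (c, 3))}        # (c, 2) cannot be extended
L_dem = {(), ((c, 1),), ((c, 1), (c, 3))}
L_short = {(), ((c, 1),)}

print(is_demonic(L), is_demonic(L_dem))                 # False True
print(below(L_dem, L), below(L, L_dem))                 # True False
print(below(L_dem, L_short), below(L_short, L_dem))     # True False
```

The last line illustrates the difference from plain inclusion: the lower language may have more traces, but on the traces of the upper language it may only offer fewer outputs.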



2.2 Specifications 

A specification $S$ defines a language $\mathcal{L}$, the subset of $\mathcal{H}$ expressing the
behaviour defined by the specification. In general the form of the specification
may vary, but in this paper we focus on model-based specifications, where the
behaviour is specified in terms of a state space $St$.

Definition 3. A (model-based) specification $S$ is a six-tuple

$$(Op, St, In, Out, Init, \_^S)$$

with operation (name) set $Op$, state set $St$, input set $In$, output set $Out$, a
nonempty set of initial states $Init \subseteq St$, and an interpretation function

$$\_^S : Op \to \mathbb{P}((In \times St) \times (St \times Out))$$

Note that $Op$, $St$, $In$, $Out$, $Init$ can be infinite sets. Any operation $op \in Op$ is
interpreted via $\_^S$ as a set of pairs

$$((\iota, s), (s', \omega))$$

where each pair represents a state transition with input $\iota \in In$, internal states
$s, s' \in St$ ($s$ denotes the state before and $s'$ the state after the operation is
performed), and output $\omega \in Out$.

For a specification $S$, the precondition of operation $op \in Op$ with input $\iota \in In$
will be denoted by

$$pre_S((\iota, op)) = \{s \in St \mid \exists s' \in St, \omega \in Out : ((\iota, s), (s', \omega)) \in op^S\}$$

We recursively define the postcondition of a trace $t \in Tr(\mathcal{H})$ by

$$ptrace_S(t) = \begin{cases} Init & \text{if } t \text{ is the empty trace} \\ \{s' \in St \mid \exists s \in St, \omega \in Out : ((\iota, s), (s', \omega)) \in op^S \wedge s \in ptrace_S(t_1)\} & \text{if } t = t_1(\iota, op) \end{cases}$$

Note that in our setting, pre- and postconditions denote sets of states, not
predicates. Given a specification $S$ and a history $h \in \mathcal{H}$, we denote the set of final
states of $h$ by

$$final_S(h) = \begin{cases} Init & \text{if } h = \varepsilon \\ \{s' \in St \mid \exists s \in St : ((\iota, s), (s', \omega)) \in op^S \wedge s \in final_S(h_1)\} & \text{if } h = h_1((\iota, op), \omega) \end{cases}$$

We can now define the language accepted by a specification S, consisting of 
the empty history and all histories that are produced by starting from an initial 
state in Init and recursively applying the operations from Op. 

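Definition 4. For a specification $S$, the language accepted by $S$ is

$$\mathcal{L}_S = \{h \in \mathcal{H} \mid h = \varepsilon \vee \exists h_1 \in \mathcal{L}_S, op \in Op, \iota \in In, \omega \in Out : h = h_1((\iota, op), \omega) \wedge (\exists s \in final_S(h_1), s' \in St : ((\iota, s), (s', \omega)) \in op^S)\}$$

It follows from this definition that $\mathcal{L}_S$ is prefix-closed (i.e., $\mathcal{L}_S \in \mathcal{L}an_{\mathcal{H}}$).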

There is a notion corresponding to demonic languages for specifications. A
specification $S$ is demonic if whenever an input/operation pair $(\iota, op)$ is enabled
in a final state of a history $h \in \mathcal{L}_S$ it must be enabled in all the final states of
the trace $\mathcal{I}(h)$. Again, the input enabling depends only on the input history.

Definition 5. A specification $S$ is demonic if

$$\forall \tau \in Tr(\mathcal{L}_S) \setminus \{\varepsilon\} : ptrace_S(front(\tau)) \subseteq pre_S(last(\tau))$$

Every demonic specification $S$ defines a demonic language $\mathcal{L}_S$ [10, Prop. 2].
The converse is not true in general: there are non-demonic specifications that
specify demonic languages. However, for every demonic language $\mathcal{L}$ there exists
a demonic specification that defines $\mathcal{L}$ [10, Prop. 11].

As an example of a demonic specification, consider the following random
number generating specification $SA1$ from [10].

There are two operations: random generates a random integer value and has
no output, and val returns the value generated by the last call to random as its
output. If no call to random has been made, val returns 0.

$$Op = \{random, val\}, \quad Out = \mathbb{Z} \cup \{\bot\}, \quad In = \{\bot\}$$

$$St^{SA1} = \mathbb{Z}, \quad Init^{SA1} = \{0\}$$

$$random^{SA1} = \{((\bot, s), (s', \bot)) \mid s, s' \in \mathbb{Z}\}$$

$$val^{SA1} = \{((\bot, i), (i, i)) \mid i \in \mathbb{Z}\}$$

By adding the operation $two^{SA1} = \{((\bot, 2), (2, 2))\}$ to specification $SA1$ we
obtain a non-demonic specification. This is because two is only enabled at state
2 and not on all states that can be generated by random.
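
The example can be checked mechanically on a finite stand-in for the integers. The sketch below is illustrative only: it assumes the state space is cut down to {0, 1, 2}, encodes the "no input / no output" symbol as `None`, and explores traces up to a fixed depth, so it is a bounded check of Definition 5 rather than a proof. The helper names (`pre`, `post`, `is_demonic_spec`) are not from the paper.

```python
BOT = None          # stands for the "no input / no output" symbol
Z = range(3)        # finite stand-in for the integers (an assumption)

SA1 = {
    "random": {((BOT, s), (s2, BOT)) for s in Z for s2 in Z},
    "val": {((BOT, i), (i, i)) for i in Z},
}

def pre(spec, call):
    """States in which the call (input, op) is enabled."""
    inp, op = call
    return {s for ((i, s), _) in spec[op] if i == inp}

def post(spec, states, call):
    """One step of ptrace: states reachable from `states` via the call."""
    inp, op = call
    return frozenset(s2 for ((i, s), (s2, _)) in spec[op] if i == inp and s in states)

def is_demonic_spec(spec, init, depth):
    """Check Definition 5 for every accepted trace of length <= depth."""
    calls = [(BOT, op) for op in spec]
    frontier = {frozenset(init)}            # ptrace sets of the traces seen so far
    for _ in range(depth):
        nxt = set()
        for states in frontier:
            for call in calls:
                enabled = pre(spec, call)
                if not states & enabled:
                    continue                # the extended trace is not accepted
                if not states <= enabled:
                    return False            # enabled in some final states only
                nxt.add(post(spec, states, call))
        frontier = nxt
    return True

with_two = dict(SA1, two={((BOT, 2), (2, 2))})
print(is_demonic_spec(SA1, {0}, 3))         # True
print(is_demonic_spec(with_two, {0}, 3))    # False
```

The second check fails after one call to random: the trace random followed by two is accepted, yet two is enabled only in state 2 of the three possible final states.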

To motivate the use of demonic specifications and languages and to give a 
few generic examples, we state the correspondence to the commonly used failure 
and trace models in the theory of communicating sequential processes (CSP) 0 

0 . 

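Given a specification $S = (Op, St, In, Out, Init, \_^S)$ and a fresh symbol $\zeta$ we
define two derived demonic specifications, the total completion

$$S(TC) = (Op, St \cup \{\zeta\}, In, Out \cup \{\zeta\}, Init, \_^{S(TC)})$$

and the failure completion

$$S(FC) = (Op, St \cup \{\zeta\}, In, Out \cup \mathbb{P}(In \times Op \times Out), Init, \_^{S(FC)})$$

of $S$. For every operation symbol $op \in Op$ we define

$$op^{S(TC)} = op^S \cup \{((\iota, s), (\zeta, \zeta)) : \iota \in In \wedge s \in St \cup \{\zeta\}\}$$

$$\begin{aligned} op^{S(FC)} = op^S &\cup \{((\iota, s), (\zeta, X)) : \iota \in In \wedge s \in St \wedge X \in \mathbb{P}(In \times Op \times Out) \\ &\qquad \wedge \exists \omega \in Out : ((\iota, op), \omega) \in X \\ &\qquad \wedge \forall ((\iota_0, op_0), \omega_0) \in X \; \nexists s' \in St : ((\iota_0, s), (s', \omega_0)) \in op_0^S\} \\ &\cup \{((\iota, \zeta), (\zeta, \emptyset)) : \iota \in In\} \end{aligned}$$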

In both specifications the state $\zeta$ is interpreted as the state that the system
enters after a failure occurred. The total completion extends the behaviour of
the original specification by assuming that a failure can occur in any state, no
matter what the input is: the system may enter the failure state at any moment.
The failure completion handles failures in a more sophisticated manner. A failure
transition can only occur if the transition was impossible in the original system.
With the help of the set $X$ the modified system can output failure transitions
at any state.

The total and the failure completion of a specification S are demonic speci- 
fications. 

Proposition 1. For every specification $S$, $S(TC)$ and $S(FC)$ are demonic with
total languages $\mathcal{L}_{S(TC)}$ and $\mathcal{L}_{S(FC)}$.

Defining the failures of a specification $S$ by

$$failures(S) = \{(h, X) : h \in \mathcal{L}_S \wedge X \in \mathbb{P}(In \times Op \times Out) \wedge \exists s \in final_S(h) \; \forall ((\iota, op), \omega) \in X \; \nexists s' \in St : ((\iota, s), (s', \omega)) \in op^S\}$$

we can formulate the following theorem. It shows that prominent failure and 
trace models of CSP PD can be expressed via demonic specifications and the 
ordering relation $\Subset$. For brevity, and because the theorem is not used later on, the
proofs have been omitted from the paper. The first equivalence is rather obvious
and states that the inclusion relation on the languages (input/output trace sets)
of the specifications is characterised by the $\Subset$ relation of the underlying total
completions. The second equivalence characterises the inclusion relation on the
failure sets of specifications in terms of the $\Subset$ relation on the underlying failure
completions.
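Theorem 1. Let $S^A$ and $S^C$ be any specifications. Then, the following
correspondences hold.

i) $\mathcal{L}_{S^C(TC)} \Subset \mathcal{L}_{S^A(TC)} \Leftrightarrow \mathcal{L}_{S^C} \subseteq \mathcal{L}_{S^A}$

ii) $\mathcal{L}_{S^C(FC)} \Subset \mathcal{L}_{S^A(FC)} \Leftrightarrow failures(S^C) \subseteq failures(S^A)$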

3 Lattice of Demonic Languages 

In this section, we present the lattice structure that is induced on the
prefix-closed and demonic languages by the ordering relation $\Subset$. First we introduce
operators that define the supremum and infimum in the lattice. To obtain a
complete lattice, we add the new symbol $\bot$ as the smallest element to both
$\mathcal{L}an_{\mathcal{H}}$ and $\mathcal{L}an_{\mathcal{D}}$.

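Definition 6. i) $\mathcal{L}_\varepsilon = \{\varepsilon\}$, $\mathcal{L}an_{\mathcal{H}}^{\bot} = \mathcal{L}an_{\mathcal{H}} \cup \{\bot\}$, $\mathcal{L}an_{\mathcal{D}}^{\bot} = \mathcal{L}an_{\mathcal{D}} \cup \{\bot\}$

ii) For a nonempty family of languages $\mathcal{L}_i \in \mathcal{L}an_{\mathcal{H}}$, $i \in I$:

$$IT(\mathcal{L}_i) = \bigcap_{i \in I} Tr(\mathcal{L}_i), \quad UT(\mathcal{L}_i) = \bigcup_{i \in I} Tr(\mathcal{L}_i)$$

iii) For a nonempty family of languages $\mathcal{L}_i \in \mathcal{L}an_{\mathcal{H}}$, $i \in I$:

$$\bigvee_I \mathcal{L}_i = \bigcup \{\Omega_{\mathcal{L}_i}(\tau) \mid i \in I \wedge \tau \in IT(\mathcal{L}_i)\}$$

$$\bigwedge_I \mathcal{L}_i = \begin{cases} \bot & \text{if } \exists \tau \in UT(\mathcal{L}_i) : \bigcap \{\Omega_{\mathcal{L}_i}(\tau) \mid i \in I \wedge \tau \in Tr(\mathcal{L}_i)\} = \emptyset \\ \bigcup \{\bigcap \{\Omega_{\mathcal{L}_i}(\tau) \mid i \in I \wedge \tau \in Tr(\mathcal{L}_i)\} \mid \tau \in UT(\mathcal{L}_i)\} & \text{otherwise} \end{cases}$$

$$\bigwedge^{\mathcal{H}}_I \mathcal{L}_i = \begin{cases} \bot & \text{if } \nexists \mathcal{L} \in \mathcal{L}an_{\mathcal{H}} : \mathcal{L} \subseteq \bigwedge_I \mathcal{L}_i \wedge Tr(\mathcal{L}) = Tr(\bigwedge_I \mathcal{L}_i) \\ \bigcup \{\mathcal{L} \in \mathcal{L}an_{\mathcal{H}} \mid \mathcal{L} \subseteq \bigwedge_I \mathcal{L}_i \wedge Tr(\mathcal{L}) = Tr(\bigwedge_I \mathcal{L}_i)\} & \text{otherwise} \end{cases}$$

$$\bigwedge^{\mathcal{D}}_I \mathcal{L}_i = \begin{cases} \bot & \text{if } \nexists \mathcal{L} \in \mathcal{L}an_{\mathcal{D}} : \mathcal{L} \subseteq \bigwedge_I \mathcal{L}_i \wedge Tr(\mathcal{L}) = Tr(\bigwedge_I \mathcal{L}_i) \\ \bigcup \{\mathcal{L} \in \mathcal{L}an_{\mathcal{D}} \mid \mathcal{L} \subseteq \bigwedge_I \mathcal{L}_i \wedge Tr(\mathcal{L}) = Tr(\bigwedge_I \mathcal{L}_i)\} & \text{otherwise} \end{cases}$$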

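It is possible that for prefix-closed languages $\mathcal{L}_i$, $i \in I$, the language $\bigwedge_I \mathcal{L}_i$ is
not prefix-closed. In fact, we will see that $\bigwedge^{\mathcal{H}}_I \mathcal{L}_i$, if not $\bot$, is the greatest
prefix-closed language below $\bigwedge_I \mathcal{L}_i$ (w.r.t. $\Subset$). Also, if all $\mathcal{L}_i$, $i \in I$ are demonic then
$\bigwedge^{\mathcal{H}}_I \mathcal{L}_i$ is not necessarily demonic. This is illustrated by the following example.
Assume operations $c_1$ and $c_2$ with no inputs (we will use $c_1$ as a shorthand for
the call $(\bot, c_1)$, and similarly for $c_2$), and outputs $o_1$, $o_2$, $o_3$, and $o_4$. We define
the languages

$$\begin{aligned} \mathcal{L}_1 = \{&\varepsilon, \langle (c_1, o_1) \rangle, \langle (c_1, o_2) \rangle, \langle (c_2, o_3) \rangle, \langle (c_2, o_4) \rangle, \\ &\langle (c_1, o_1)(c_2, o_3) \rangle, \langle (c_1, o_2)(c_2, o_3) \rangle, \langle (c_2, o_3)(c_1, o_2) \rangle, \langle (c_2, o_4)(c_1, o_2) \rangle\} \end{aligned}$$

$$\mathcal{L}_2 = \{\varepsilon, \langle (c_1, o_1) \rangle, \langle (c_2, o_3) \rangle, \langle (c_2, o_4) \rangle, \langle (c_2, o_3)(c_1, o_1) \rangle, \langle (c_2, o_4)(c_1, o_2) \rangle\}$$

$\mathcal{L}_1$, $\mathcal{L}_2$ are demonic languages with

$$\begin{aligned} \mathcal{L}_1 \wedge \mathcal{L}_2 = \{&\varepsilon, \langle (c_1, o_1) \rangle, \langle (c_2, o_3) \rangle, \langle (c_2, o_4) \rangle, \\ &\langle (c_1, o_1)(c_2, o_3) \rangle, \langle (c_1, o_2)(c_2, o_3) \rangle, \langle (c_2, o_4)(c_1, o_2) \rangle\} \end{aligned}$$

$$\begin{aligned} \mathcal{L}_1 \wedge^{\mathcal{H}} \mathcal{L}_2 = \{&\varepsilon, \langle (c_1, o_1) \rangle, \boxed{\langle (c_2, o_3) \rangle}, \langle (c_2, o_4) \rangle, \\ &\langle (c_1, o_1)(c_2, o_3) \rangle, \langle (c_2, o_4)(c_1, o_2) \rangle\} \end{aligned}$$

$$\mathcal{L}_1 \wedge^{\mathcal{D}} \mathcal{L}_2 = \{\varepsilon, \langle (c_1, o_1) \rangle, \langle (c_2, o_4) \rangle, \langle (c_1, o_1)(c_2, o_3) \rangle, \langle (c_2, o_4)(c_1, o_2) \rangle\}$$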
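
The three results of this example can be recomputed by brute force. The following sketch is illustrative and rests on stated assumptions: histories are tuples of (call, output) pairs, `None` stands for the bottom element, and both refined meets are taken as the union of all suitable sublanguages of the trace-wise intersection that keep every trace, found by enumerating subsets. That enumeration is only feasible for tiny languages; the function names are hypothetical.

```python
from itertools import combinations

def trace(h):
    return tuple(call for call, _ in h)

def traces(lang):
    return {trace(h) for h in lang}

def omega(lang, t):
    return {h for h in lang if trace(h) == t}

def is_prefix_closed(lang):
    return all(h[:-1] in lang for h in lang if h)

def is_demonic(lang):
    return is_prefix_closed(lang) and all(
        {h[:-1] for h in omega(lang, t)} == omega(lang, t[:-1])
        for t in traces(lang) if t
    )

def meet(langs):
    """Trace-wise intersection; None plays the role of the bottom element."""
    result = set()
    for t in set().union(*map(traces, langs)):
        common = set.intersection(*(omega(l, t) for l in langs if t in traces(l)))
        if not common:
            return None
        result |= common
    return result

def greatest_sublanguage(lang, pred):
    """Union of all subsets of lang that satisfy pred and keep every trace."""
    items, result, found = list(lang), set(), False
    for r in range(1, len(items) + 1):
        for subset in map(set, combinations(items, r)):
            if traces(subset) == traces(lang) and pred(subset):
                result |= subset
                found = True
    return result if found else None

def h(*pairs):
    return tuple(pairs)

e = ()
a1, a2 = ("c1", "o1"), ("c1", "o2")
b3, b4 = ("c2", "o3"), ("c2", "o4")

L1 = {e, h(a1), h(a2), h(b3), h(b4), h(a1, b3), h(a2, b3), h(b3, a2), h(b4, a2)}
L2 = {e, h(a1), h(b3), h(b4), h(b3, a1), h(b4, a2)}

m = meet([L1, L2])
m_h = greatest_sublanguage(m, is_prefix_closed)
m_d = greatest_sublanguage(m, is_demonic)
print(sorted(m - m_h))    # [(('c1', 'o2'), ('c2', 'o3'))]
print(sorted(m_h - m_d))  # [(('c2', 'o3'),)]
```

The two printed differences are exactly the histories that disappear in the example: the history whose prefix is missing, and then the history with output $o_3$ that cannot be extended by $c_1$.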



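The following two lemmas prove properties of $\bigvee_I \mathcal{L}_i$, $\bigwedge_I \mathcal{L}_i$, $\bigwedge^{\mathcal{H}}_I \mathcal{L}_i$, and $\bigwedge^{\mathcal{D}}_I \mathcal{L}_i$
that will allow us to derive the complete lattice structure for languages ordered
by $\Subset$.

Lemma 1 For any nonempty family of languages $\mathcal{L}_i \in \mathcal{L}an_{\mathcal{H}}$, $i \in I$:

a) $\bigwedge^{\mathcal{H}}_I \mathcal{L}_i \in \mathcal{L}an_{\mathcal{H}}^{\bot}$ and $\bigwedge^{\mathcal{D}}_I \mathcal{L}_i \in \mathcal{L}an_{\mathcal{D}}^{\bot}$

b) If $\bigwedge_I \mathcal{L}_i \neq \bot$ then $\bigwedge_I \mathcal{L}_i \Subset \mathcal{L}_j$ for all $j \in I$

Proof. a) The union of prefix-closed languages is again prefix-closed, and the
union of demonic languages with the same set of traces is also demonic [10,
Prop. 1].

b) For any $j \in I$, $Tr(\mathcal{L}_j) \subseteq UT(\mathcal{L}_i) = Tr(\bigwedge_I \mathcal{L}_i)$ and for $\tau \in Tr(\mathcal{L}_j)$ we get

$$\Omega_{\bigwedge_I \mathcal{L}_i}(\tau) = \bigcap \{\Omega_{\mathcal{L}_i}(\tau) \mid i \in I \wedge \tau \in Tr(\mathcal{L}_i)\} \subseteq \Omega_{\mathcal{L}_j}(\tau) \qquad \Box$$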

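Lemma 2 For any nonempty family of languages $\mathcal{L}_i \in \mathcal{L}an_{\mathcal{H}}$, $i \in I$:

a) $\bigvee_I \mathcal{L}_i \in \mathcal{L}an_{\mathcal{H}}$

b) If all $\mathcal{L}_i \in \mathcal{L}an_{\mathcal{D}}$, then $\bigvee_I \mathcal{L}_i \in \mathcal{L}an_{\mathcal{D}}$

Proof. a) Let $hz \in \bigvee_I \mathcal{L}_i$. Then, $\mathcal{I}(hz) \in IT(\mathcal{L}_i)$ and since $\mathcal{L}_i$, $i \in I$, is
prefix-closed, we can conclude $\mathcal{I}(h) \in IT(\mathcal{L}_i)$. Moreover, for all $i_0 \in I$ for which
$hz \in \mathcal{L}_{i_0}$, because $\mathcal{L}_{i_0}$ is prefix-closed, we have $h \in \mathcal{L}_{i_0}$, and so, by definition of
the $\bigvee$-operator, $h \in \bigvee_I \mathcal{L}_i$.

b) Let $\mathcal{L}_i$, $i \in I$ be demonic. Then we fix a trace $\tau \in Tr(\bigvee_I \mathcal{L}_i) \setminus \{\varepsilon\}$ and
some $h \in \Omega_{\bigvee_I \mathcal{L}_i}(front(\tau))$. Because

$$\Omega_{\bigvee_I \mathcal{L}_i}(front(\tau)) = \bigcup \{\Omega_{\mathcal{L}_i}(front(\tau)) \mid i \in I\}$$

we can find $i_0 \in I$ with $h \in \Omega_{\mathcal{L}_{i_0}}(front(\tau))$. Since $\mathcal{L}_{i_0}$ is demonic and
$\tau \in \bigcap_{i \in I} Tr(\mathcal{L}_i)$ there exists an $h' \in \Omega_{\mathcal{L}_{i_0}}(\tau)$ with $h' \uparrow (\#\tau - 1) = h$. We can conclude
$h' \in \Omega_{\bigvee_I \mathcal{L}_i}(\tau)$ and so $\bigvee_I \mathcal{L}_i \in \mathcal{L}an_{\mathcal{D}}$. $\Box$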

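We also need the following result, proven in [10, Prop. 5].

Lemma 3 For languages $\mathcal{L}, \mathcal{L}' \subseteq \mathcal{H}$ we have:

a) $\mathcal{L}' \Subset \mathcal{L} \Rightarrow Tr(\mathcal{L} \cap \mathcal{L}') = Tr(\mathcal{L}) \cap Tr(\mathcal{L}') = Tr(\mathcal{L})$, $\mathcal{L}' \Subset \mathcal{L} \cap \mathcal{L}' \Subset \mathcal{L}$

b) ($\mathcal{L}'$ demonic $\wedge$ $\mathcal{L}' \Subset \mathcal{L}$) $\Rightarrow$ $\mathcal{L} \cap \mathcal{L}'$ demonic

If we extend the ordering $\Subset$ in a canonical way from $\mathcal{L}an_{\mathcal{H}}$ to $\mathcal{L}an_{\mathcal{H}}^{\bot}$ such that
$\bot$ becomes the smallest element in $\mathcal{L}an_{\mathcal{H}}^{\bot}$, then we do obtain a complete lattice.
Furthermore, we set $\bigvee_\emptyset \mathcal{L}_i = \bot$ and $\bigwedge_\emptyset \mathcal{L}_i = \bigwedge^{\mathcal{H}}_\emptyset \mathcal{L}_i = \bigwedge^{\mathcal{D}}_\emptyset \mathcal{L}_i = \mathcal{L}_\varepsilon$.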

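Theorem 2. a) $(\mathcal{L}an_{\mathcal{H}}^{\bot}, \Subset)$ is a complete lattice with greatest element $\mathcal{L}_\varepsilon$ and
smallest element $\bot$. For a family $\mathcal{L}_i$, $i \in I$ of languages in $\mathcal{L}an_{\mathcal{H}}^{\bot}$, its supremum
is $\bigvee_{\{i \in I \mid \mathcal{L}_i \neq \bot\}} \mathcal{L}_i$ and its infimum is $\bot$ if there is at least one $i \in I$ with $\mathcal{L}_i = \bot$
and $\bigwedge^{\mathcal{H}}_I \mathcal{L}_i$ otherwise.

b) $(\mathcal{L}an_{\mathcal{D}}^{\bot}, \Subset)$ is a complete lattice with greatest element $\mathcal{L}_\varepsilon$ and smallest
element $\bot$. For a family $\mathcal{L}_i$, $i \in I$ of languages in $\mathcal{L}an_{\mathcal{D}}^{\bot}$, its supremum is
$\bigvee_{\{i \in I \mid \mathcal{L}_i \neq \bot\}} \mathcal{L}_i$ and its infimum is $\bot$ if there is at least one $i \in I$ with $\mathcal{L}_i = \bot$
and $\bigwedge^{\mathcal{D}}_I \mathcal{L}_i$ otherwise.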

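Proof. a) Let $\mathcal{L} \in \mathcal{L}an_{\mathcal{H}}$. Then, $\{\varepsilon\} = Tr(\mathcal{L}_\varepsilon) \subseteq Tr(\mathcal{L})$ and $\Omega_{\mathcal{L}}(\varepsilon) = \{\varepsilon\} \subseteq \Omega_{\mathcal{L}_\varepsilon}(\varepsilon)$.
Hence, $\mathcal{L} \Subset \mathcal{L}_\varepsilon$ which shows that $\mathcal{L}_\varepsilon$ is the greatest element in $\mathcal{L}an_{\mathcal{H}}^{\bot}$.
Let $\mathcal{L}_i$, $i \in I$ be a nonempty family of languages in $\mathcal{L}an_{\mathcal{H}}$. Applying Lemma
2a) shows $\bigvee_I \mathcal{L}_i \in \mathcal{L}an_{\mathcal{H}}$. For any $j \in I$, $Tr(\bigvee_I \mathcal{L}_i) = IT(\mathcal{L}_i) \subseteq Tr(\mathcal{L}_j)$. For
any trace $\tau \in Tr(\bigvee_I \mathcal{L}_i)$,

$$\Omega_{\mathcal{L}_j}(\tau) \subseteq \bigcup_{i \in I} \Omega_{\mathcal{L}_i}(\tau) = \Omega_{\bigvee_I \mathcal{L}_i}(\tau)$$

Therefore, $\mathcal{L}_j \Subset \bigvee_I \mathcal{L}_i$, $j \in I$.

Moreover, for any $\mathcal{L} \in \mathcal{L}an_{\mathcal{H}}$ with $\mathcal{L}_i \Subset \mathcal{L}$ for all $i \in I$, we have $Tr(\mathcal{L}) \subseteq IT(\mathcal{L}_i) =
Tr(\bigvee_I \mathcal{L}_i)$ and for $\tau \in Tr(\mathcal{L})$ we have

$$\Omega_{\bigvee_I \mathcal{L}_i}(\tau) = \bigcup_{i \in I} \Omega_{\mathcal{L}_i}(\tau) \subseteq \bigcup_{i \in I} \Omega_{\mathcal{L}}(\tau) = \Omega_{\mathcal{L}}(\tau)$$

We deduce $\bigvee_I \mathcal{L}_i \Subset \mathcal{L}$.

This proves that $\bigvee_I \mathcal{L}_i$ is the least upper bound of the family $\mathcal{L}_i$, $i \in I$ in
$\mathcal{L}an_{\mathcal{H}}$. It remains to extend this in a straightforward manner to $\mathcal{L}an_{\mathcal{H}}^{\bot}$.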

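Next we claim $\bigwedge_I \mathcal{L}_i \Subset \mathcal{L}_j$ for every $j \in I$. This is trivial if $\bigwedge_I \mathcal{L}_i = \bot$ and
the remaining case has been shown in Lemma 1b).

Therefore, $\bigwedge^{\mathcal{H}}_I \mathcal{L}_i \subseteq \bigwedge_I \mathcal{L}_i \Subset \mathcal{L}_j$ for every $j \in I$. Since $\bigwedge^{\mathcal{H}}_I \mathcal{L}_i$, if not equal to
$\bot$, has the same set of traces as $\bigwedge_I \mathcal{L}_i$, we can infer

$$\bigwedge^{\mathcal{H}}_I \mathcal{L}_i \Subset \bigwedge_I \mathcal{L}_i \Subset \mathcal{L}_j$$

for every $j \in I$. Let $\mathcal{L} \in \mathcal{L}an_{\mathcal{H}}^{\bot}$ with $\mathcal{L} \Subset \mathcal{L}_j$, for all $j \in I$. We must prove
that $\mathcal{L} \Subset \bigwedge^{\mathcal{H}}_I \mathcal{L}_i$, which trivially holds for $\mathcal{L} = \bot$. For $\mathcal{L} \neq \bot$, we first prove
that $\mathcal{L} \Subset \bigwedge_I \mathcal{L}_i$. Note that $\emptyset \neq Tr(\bigwedge_I \mathcal{L}_i) = UT(\mathcal{L}_i) \subseteq Tr(\mathcal{L})$. Moreover,
$Tr(\mathcal{L}_j) \subseteq Tr(\mathcal{L})$ for all $j \in I$ and, if $\tau \in UT(\mathcal{L}_i)$ then

$$\Omega_{\mathcal{L}}(\tau) \subseteq \bigcap \{\Omega_{\mathcal{L}_i}(\tau) \mid i \in I \wedge \tau \in Tr(\mathcal{L}_i)\} = \Omega_{\bigwedge_I \mathcal{L}_i}(\tau)$$

Hence, $\mathcal{L} \Subset \bigwedge_I \mathcal{L}_i$.

We define $\mathcal{L}' = \mathcal{L} \cap (\bigwedge_I \mathcal{L}_i)$ and claim the following two properties.

$$\mathcal{L}' \in \mathcal{L}an_{\mathcal{H}} \qquad (1)$$

$$\mathcal{L} \Subset \mathcal{L}' \Subset \bigwedge^{\mathcal{H}}_I \mathcal{L}_i \Subset \bigwedge_I \mathcal{L}_i \qquad (2)$$

We show (1) first. Note that $\varepsilon \in \mathcal{L}'$. Let $hz \in \mathcal{L}'$. $\mathcal{L}$ is prefix-closed and
so $h \in \mathcal{L}$. Furthermore, $\mathcal{I}(hz) \in Tr(\bigwedge_I \mathcal{L}_i) = UT(\mathcal{L}_i)$. From $\mathcal{L}_i$, $i \in I$ being
prefix-closed, we can deduce $\mathcal{I}(h) \in Tr(\bigwedge_I \mathcal{L}_i)$. Since $\mathcal{L} \Subset \bigwedge_I \mathcal{L}_i$, we can conclude
$h \in \bigwedge_I \mathcal{L}_i$ and therefore $h \in \mathcal{L}'$.

To prove (2), we note that $\bigwedge^{\mathcal{H}}_I \mathcal{L}_i \Subset \bigwedge_I \mathcal{L}_i$, which was shown above. $\mathcal{L}'$ is
prefix-closed by (1), and furthermore $\mathcal{L}' \subseteq \bigwedge_I \mathcal{L}_i$ and $\mathcal{L} \Subset \bigwedge_I \mathcal{L}_i$. By Lemma 3a),
we can conclude $Tr(\mathcal{L}') = Tr(\bigwedge_I \mathcal{L}_i)$. Hence, by definition of the $\bigwedge^{\mathcal{H}}$-operator
we obtain $\mathcal{L}' \subseteq \bigwedge^{\mathcal{H}}_I \mathcal{L}_i$. That $\mathcal{L}'$ and $\bigwedge^{\mathcal{H}}_I \mathcal{L}_i$ have the same traces then implies

$$\mathcal{L}' \Subset \bigwedge^{\mathcal{H}}_I \mathcal{L}_i$$

The property $\mathcal{L} \Subset \mathcal{L}'$ follows from Lemma 3a) and $\mathcal{L} \Subset \bigwedge_I \mathcal{L}_i$.

From (2) we obtain $\mathcal{L} \Subset \bigwedge^{\mathcal{H}}_I \mathcal{L}_i$ by transitivity of $\Subset$, which is what we wanted
to verify.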

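b) It is obvious that $\mathcal{L}_\varepsilon$ is demonic and since $\mathcal{L}_\varepsilon$ is the greatest element in
$\mathcal{L}an_{\mathcal{H}}^{\bot}$, it follows that $\mathcal{L}_\varepsilon$ is also the greatest element in $\mathcal{L}an_{\mathcal{D}}^{\bot}$.

For any family $\mathcal{L}_i$, $i \in I$ in $\mathcal{L}an_{\mathcal{D}}$ we get $\bigvee_I \mathcal{L}_i$ as its supremum in $\mathcal{L}an_{\mathcal{H}}$.
According to Lemma 2b), $\bigvee_I \mathcal{L}_i$ is demonic and therefore, it is the supremum
of $\mathcal{L}_i$, $i \in I$ in $\mathcal{L}an_{\mathcal{D}}$.

It remains to show that $\bigwedge^{\mathcal{D}}_I \mathcal{L}_i$ is the greatest lower bound for $\mathcal{L}_i$, $i \in I$ in
$\mathcal{L}an_{\mathcal{D}}^{\bot}$. From the definitions of $\bigwedge^{\mathcal{H}}_I \mathcal{L}_i$ and $\bigwedge^{\mathcal{D}}_I \mathcal{L}_i$ we can infer $\bigwedge^{\mathcal{D}}_I \mathcal{L}_i \Subset \bigwedge^{\mathcal{H}}_I \mathcal{L}_i$ and
together with a) we then obtain

$$\bigwedge^{\mathcal{D}}_I \mathcal{L}_i \Subset \bigwedge^{\mathcal{H}}_I \mathcal{L}_i \Subset \mathcal{L}_j, \quad j \in I$$

Now, assume $\mathcal{L} \in \mathcal{L}an_{\mathcal{D}}$ with $\mathcal{L} \Subset \mathcal{L}_i$, for all $i \in I$. With a) we deduce

$$\mathcal{L} \Subset \bigwedge^{\mathcal{H}}_I \mathcal{L}_i \Subset \bigwedge_I \mathcal{L}_i$$

As above, by setting $\mathcal{L}' = \mathcal{L} \cap (\bigwedge_I \mathcal{L}_i)$ and by using Lemma 3b), we obtain that
$\mathcal{L}'$ is demonic. Property (1) holds again and by definition of $\bigwedge^{\mathcal{D}}_I \mathcal{L}_i$,

$$\mathcal{L}' \subseteq \bigwedge^{\mathcal{D}}_I \mathcal{L}_i$$

Note that $Tr(\bigwedge^{\mathcal{D}}_I \mathcal{L}_i) = Tr(\bigwedge_I \mathcal{L}_i) = Tr(\mathcal{L}')$ which
follows from Lemma 3a). Therefore,

$$\mathcal{L} \Subset \mathcal{L}' \Subset \bigwedge^{\mathcal{D}}_I \mathcal{L}_i$$

and from the transitivity of $\Subset$ we can conclude that $\mathcal{L} \Subset \bigwedge^{\mathcal{D}}_I \mathcal{L}_i$. $\Box$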

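The following results further explore the correspondence between the
operators $\bigwedge$, $\bigwedge^{\mathcal{H}}$ and $\bigwedge^{\mathcal{D}}$.

Proposition 1 If $\mathcal{L}_i \in \mathcal{L}an_{\mathcal{H}}$, $i \in I$, is a downward-ordered net (i.e., $\forall i, j \in
I \; \exists k \in I : \mathcal{L}_k \Subset \mathcal{L}_i \wedge \mathcal{L}_k \Subset \mathcal{L}_j$), then $\bigwedge^{\mathcal{H}}_I \mathcal{L}_i = \bigwedge_I \mathcal{L}_i$.

Proof. For a downward-ordered net $\mathcal{L}_i$, $i \in I$, since $\bigwedge^{\mathcal{H}}_I \mathcal{L}_i$, if not equal to $\bot$,
is the greatest prefix-closed set in $\bigwedge_I \mathcal{L}_i$, we simply have to show that $\bigwedge_I \mathcal{L}_i$ is
prefix-closed.

Let $hz \in \bigwedge_I \mathcal{L}_i$. There is $i_0 \in I$ such that $hz \in \mathcal{L}_{i_0}$. $\mathcal{L}_{i_0}$ is prefix-closed, and
so $h \in \mathcal{L}_{i_0}$. Thus, $\mathcal{I}(h) \in Tr(\bigwedge_I \mathcal{L}_i)$.

Now for all $i_1 \in I$ such that $\mathcal{I}(h) \in Tr(\mathcal{L}_{i_1})$, there exists an $i_2 \in I$ with
$\mathcal{L}_{i_2} \Subset \mathcal{L}_{i_0}$ and $\mathcal{L}_{i_2} \Subset \mathcal{L}_{i_1}$. Hence, $\mathcal{I}(hz) \in Tr(\mathcal{L}_{i_2})$ and $hz \in \mathcal{L}_{i_2}$. $\mathcal{L}_{i_2}$ is
prefix-closed and so $h \in \mathcal{L}_{i_2}$. From $\Omega_{\mathcal{L}_{i_2}}(\mathcal{I}(h)) \subseteq \Omega_{\mathcal{L}_{i_1}}(\mathcal{I}(h))$ we conclude $h \in \mathcal{L}_{i_1}$.
From this we conclude $h \in \bigwedge_I \mathcal{L}_i$, which proves that $\bigwedge_I \mathcal{L}_i$ is prefix-closed. $\Box$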
Definition 7. A language $\mathcal{L}$ is $\tau$-output-finite for a trace $\tau \in Tr(\mathcal{L}) \setminus \{\varepsilon\}$ if for
all histories $h \in \Omega_{\mathcal{L}}(front(\tau))$ the set

$$\{h' \mid h' \in \Omega_{\mathcal{L}}(\tau) \wedge h' \uparrow \#h = h\}$$

is finite. $\mathcal{L}$ is output-finite if it is $\tau$-output-finite for all $\tau \in Tr(\mathcal{L}) \setminus \{\varepsilon\}$.

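Proposition 2 If $\mathcal{L}_i \in \mathcal{L}an_{\mathcal{D}}$, $i \in I$ is a downward-ordered net and for all $\tau \in
Tr(\bigwedge_I \mathcal{L}_i)$ there exists an $i \in I$ with $\mathcal{L}_i$ being $\tau$-output-finite, then $\bigwedge^{\mathcal{D}}_I \mathcal{L}_i = \bigwedge_I \mathcal{L}_i$.

Proof. It suffices to prove that $\bigwedge_I \mathcal{L}_i$ is demonic, assuming $\bigwedge_I \mathcal{L}_i \neq \bot$. Let $\tau \in
Tr(\bigwedge_I \mathcal{L}_i) \setminus \{\varepsilon\}$ and $h \in \Omega_{\bigwedge_I \mathcal{L}_i}(front(\tau))$. We find $i_1 \in I$ with $\mathcal{L}_{i_1}$ being $\tau$-output-finite.
Furthermore, $\mathcal{L}_{i_1}$ is demonic, and so we can conclude $front(\tau) \in Tr(\mathcal{L}_{i_1})$
and that the set

$$A = \{h' \mid h' \in \Omega_{\mathcal{L}_{i_1}}(\tau) \wedge h' \uparrow \#h = h\}$$

is nonempty and finite. We claim that there exists an $h' \in A$ with $h' \in \bigwedge_I \mathcal{L}_i$. If
not, we can find $j_1, \ldots, j_n \in I$ such that $\tau \in Tr(\mathcal{L}_{j_k})$, $1 \leq k \leq n$, and
$\Omega_{\mathcal{L}_{j_1}}(\tau) \cap \ldots \cap \Omega_{\mathcal{L}_{j_n}}(\tau) \cap A = \emptyset$.
But there exists an $i_2 \in I$ with $\mathcal{L}_{i_2} \Subset \mathcal{L}_{i_1}$ and $\mathcal{L}_{i_2} \Subset \mathcal{L}_{j_k}$, $1 \leq k \leq n$. We infer
$\tau \in Tr(\mathcal{L}_{i_2})$ and $h \in \mathcal{L}_{i_2}$. Again, $\mathcal{L}_{i_2}$ is demonic and so

$$\emptyset \neq \{h' \mid h' \in \Omega_{\mathcal{L}_{i_2}}(\tau) \wedge h' \uparrow \#h = h\} \subseteq \Omega_{\mathcal{L}_{j_1}}(\tau) \cap \ldots \cap \Omega_{\mathcal{L}_{j_n}}(\tau) \cap A$$

which contradicts the assumption above. $\Box$

This shows that $\bigwedge_I \mathcal{L}_i$ is demonic, and hence $\bigwedge^{\mathcal{D}}_I \mathcal{L}_i = \bigwedge_I \mathcal{L}_i$, for
downward-ordered nets of output-finite and demonic languages $\mathcal{L}_i$, $i \in I$.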

4 Refinement 

Data-refinement proofs miiiiii! are used to verify that a specification (or 
implementation) $S^C$ with a concrete state representation is correct with respect
to a specification $S^A$ with an abstract state representation. In [10], we show
the soundness and completeness of refinement with respect to the ordering $\Subset$
on demonic languages using a combination of forward and backward refinement.
Here we generalise this notion of refinement to one that is based on an abstraction 
relation on the power sets of the underlying state spaces, and show that this 
provides a single sound and complete refinement rule. 

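Definition 8. Given two specifications $S^A = (Op, St^A, In, Out, Init^A, \_^{S^A})$ and
$S^C = (Op, St^C, In, Out, Init^C, \_^{S^C})$, an abstraction relation

$$ABS : \mathbb{P} St^C \leftrightarrow \mathbb{P} St^A$$

and operation $op \in Op$, we say that $op^{S^A}$ power-refines to $op^{S^C}$ ($op^{S^A} \sqsubseteq_{ABS}
op^{S^C}$) if the following conditions hold.

(P1) $\forall \iota \in In, R \in \mathbb{P} St^A, T \in \mathbb{P} St^C :$
$((T, R) \in ABS \wedge \emptyset \neq R \subseteq pre_{S^A}((\iota, op))) \Rightarrow T \subseteq pre_{S^C}((\iota, op))$

(P2) $\forall \iota \in In, \omega \in Out, R \in \mathbb{P} St^A, T \in \mathbb{P} St^C, t \in T, t' \in St^C :$
$((T, R) \in ABS \wedge \emptyset \neq R \subseteq pre_{S^A}((\iota, op)) \wedge ((\iota, t), (t', \omega)) \in op^{S^C}) \Rightarrow$
$(\exists R' \in \mathbb{P} St^A, T' \in \mathbb{P} St^C :$
$(R' \neq \emptyset \wedge (T', R') \in ABS \wedge t' \in T'$
$\wedge (\forall s_1 \in R' \; \exists s_0 \in R : ((\iota, s_0), (s_1, \omega)) \in op^{S^A})))$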

Refinement with properties (P1) and (P2) lifts the common forward refinement
of VDM and Z with abstraction relations on the state spaces to refinement
with abstraction relations on power sets. (P1) states that if $T$ and $R$ are
related via $ABS$ and $(\iota, op)$ is enabled in all states in $R$ then $(\iota, op)$ is enabled
in all states in $T$ (denoted by the dashed lines in the picture below).

[Figure omitted]

(P2) states that via $ABS$ every input of $op^{S^A}$ must be accepted by $op^{S^C}$ with
outputs that were possible for $op^{S^A}$. The last condition in (P2) asserts that all
states that are contained in the abstract state set $R'$ and related to a concrete
state set $T'$ are final states under the abstract operation with input and output
that were accepted by the concrete operation. This is illustrated in the picture
below where the dashed lines indicate how the concrete transition is simulated
by the abstract one.

[Figure omitted; transition label: $((\iota, op), \omega)$]

This notion of operation refinement naturally leads to the following notion 
of specification refinement. 

Definition 9. We say that specification $S^A = (Op, St^A, In, Out, Init^A, \_^{S^A})$
power-refines to specification $S^C = (Op, St^C, In, Out, Init^C, \_^{S^C})$ and write
$S^A \sqsubseteq S^C$ if there exists an abstraction relation $ABS$ as above such that

(PR1) $\forall op \in Op : op^{S^A} \sqsubseteq_{ABS} op^{S^C}$

(PR2) $(Init^C, Init^A) \in ABS$

We write $S^A \sqsubseteq_{ABS} S^C$ if we want to explicitly indicate the abstraction relation
$ABS$.

We will see below that the relation $\sqsubseteq$ defines a preorder on demonic
specifications. Obligation (PR1) states that every abstract operation can be
power-refined to a concrete one and obligation (PR2) states that the abstract initial
state set corresponds to the concrete initial state set. Note that we overload
the semantics of the symbol $\sqsubseteq$. It will be obvious from the context whether we
mean operation or specification refinement.

As an example, consider the following alternative specification $SA2$ of the
random number generating specification $SA1$ of Section 2.2.

$$St^{SA2} = \mathbb{Z} \cup \{undef\}, \quad Init^{SA2} = \{0\}$$

$$random^{SA2} = \{((\bot, s_0), (undef, \bot)) \mid s_0 \in \mathbb{Z} \cup \{undef\}\}$$

$$val^{SA2} = \{((\bot, z), (z, z)) \mid z \in \mathbb{Z}\} \cup \{((\bot, undef), (z, z)) \mid z \in \mathbb{Z}\}$$

In specification $SA1$ the operation random generates a random integer number
and a subsequent call to val makes this number visible. In $SA2$, random will
always set the state to undef and only a subsequent call to val will generate a
random number. In [10], we show that there is a backward refinement from $SA1$
to $SA2$, but no forward refinement. However, there is a power-refinement such
that $SA1$ power-refines to $SA2$. An abstraction relation that shows this is

$$ABS = \{(\{z\}, \{z\}) : z \in \mathbb{Z}\} \cup \{(\{undef\}, \mathbb{Z})\}$$
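
The proof obligations for this abstraction relation can be checked exhaustively once the integers are replaced by a finite set. The sketch below does that under stated assumptions: the state space is restricted to {0, 1, 2}, `None` encodes the "no input / no output" symbol, pairs in `ABS` are ordered as (concrete state set, abstract state set), and all function names are illustrative. It checks (P1), (P2) and (PR2) literally, so it says nothing about the unrestricted specifications beyond this finite instance.

```python
BOT, UNDEF = None, "undef"
Z = frozenset(range(3))     # finite stand-in for the integers (an assumption)

SA1 = {"random": {((BOT, s), (s2, BOT)) for s in Z for s2 in Z},
       "val": {((BOT, z), (z, z)) for z in Z}}
SA2 = {"random": {((BOT, s), (UNDEF, BOT)) for s in Z | {UNDEF}},
       "val": {((BOT, z), (z, z)) for z in Z}
              | {((BOT, UNDEF), (z, z)) for z in Z}}

# Pairs (T, R): T is a set of concrete (SA2) states, R a set of abstract (SA1) states.
ABS = {(frozenset({z}), frozenset({z})) for z in Z} | {(frozenset({UNDEF}), Z)}

def pre(spec, inp, op):
    return {s for ((i, s), _) in spec[op] if i == inp}

def simulated(abstract_op, inp, out, R, R2):
    """Last conjunct of (P2): every state in R2 is reached from some state in R."""
    return all(any(((inp, s0), (s1, out)) in abstract_op for s0 in R) for s1 in R2)

def op_power_refines(abstract, concrete, abs_rel, op, inputs):
    for inp in inputs:
        for T, R in abs_rel:
            if not R or not R <= pre(abstract, inp, op):
                continue                          # premise of (P1)/(P2) fails
            if not T <= pre(concrete, inp, op):
                return False                      # (P1) violated
            for ((i, t), (t2, out)) in concrete[op]:
                if i == inp and t in T and not any(
                    R2 and t2 in T2 and simulated(abstract[op], inp, out, R, R2)
                    for T2, R2 in abs_rel
                ):
                    return False                  # (P2) violated
    return True

def power_refines(abstract, init_a, concrete, init_c, abs_rel, inputs=(BOT,)):
    return (frozenset(init_c), frozenset(init_a)) in abs_rel and all(
        op_power_refines(abstract, concrete, abs_rel, op, inputs) for op in abstract
    )

identity_only = {(frozenset({z}), frozenset({z})) for z in Z}
print(power_refines(SA1, {0}, SA2, {0}, ABS))            # True
print(power_refines(SA1, {0}, SA2, {0}, identity_only))  # False
```

Dropping the pair that relates {undef} to the whole abstract state set makes (P2) fail for random, which shows why the relation has to range over sets of states here.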

If we use the ordering $\Subset$ as the underlying semantics of specification
refinement and power-refinement with obligations (PR1) and (PR2) as the
refinement technique, then Theorem 3 below states the soundness and completeness
of power-refinement for demonic specifications.

Theorem 3. For demonic specifications $S^A$ and $S^C$, $\mathcal{L}_{S^C} \Subset \mathcal{L}_{S^A} \Leftrightarrow S^A \sqsubseteq S^C$.

Proof. To prove $\mathcal{L}_{S^C} \Subset \mathcal{L}_{S^A} \Rightarrow S^A \sqsubseteq S^C$, we use the abstraction relation

$$(T, R) \in ABS \text{ iff } \exists h \in \mathcal{L}_{S^C} \cap \mathcal{L}_{S^A} : T = final_{S^C}(h) \wedge R = final_{S^A}(h)$$

Because $\varepsilon \in \mathcal{L}_{S^C} \cap \mathcal{L}_{S^A}$, $Init^C = final_{S^C}(\varepsilon)$, and $Init^A = final_{S^A}(\varepsilon)$, we can
conclude $(Init^C, Init^A) \in ABS$, which establishes (PR2).

For (P1), we assume $(T, R) \in ABS$ and $\emptyset \neq R \subseteq pre_{S^A}((\iota, op))$. According
to the definition of $ABS$, there is $h \in \mathcal{L}_{S^C} \cap \mathcal{L}_{S^A}$ with $T = final_{S^C}(h)$ and $R =
final_{S^A}(h)$. From $\mathcal{I}(h)(\iota, op) \in Tr(\mathcal{L}_{S^A})$ and $\mathcal{L}_{S^C} \Subset \mathcal{L}_{S^A}$, it follows $\mathcal{I}(h)(\iota, op) \in
Tr(\mathcal{L}_{S^C})$. $S^C$ is demonic and therefore $T = final_{S^C}(h) \subseteq ptrace_{S^C}(\mathcal{I}(h)) \subseteq
pre_{S^C}((\iota, op))$.

To prove (P2), we additionally assume $\omega \in Out$, $t \in T$ and $t' \in St^C$
with $((\iota, t), (t', \omega)) \in op^{S^C}$. Hence, $h((\iota, op), \omega) \in \mathcal{L}_{S^C}$ and because of $\mathcal{L}_{S^C} \Subset
\mathcal{L}_{S^A}$ we have $h((\iota, op), \omega) \in \mathcal{L}_{S^A}$. For $T' = final_{S^C}(h((\iota, op), \omega))$ and $R' =
final_{S^A}(h((\iota, op), \omega))$ we derive $(T', R') \in ABS$, $t' \in T'$ and $R' \neq \emptyset$. Finally,
because $R' = final_{S^A}(h((\iota, op), \omega))$, for any $s_1 \in R'$ there is an $s_0 \in R$ such that
$((\iota, s_0), (s_1, \omega)) \in op^{S^A}$, which concludes the proof of (P2).

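To prove $\mathcal{L}_{S^C} \Subset \mathcal{L}_{S^A} \Leftarrow S^A \sqsubseteq S^C$, we first prove the following property by
induction on the length of the traces in $Tr(\mathcal{L}_{S^A})$.

$$\begin{aligned} &\forall \tau = (\iota_1, op_1) \ldots (\iota_{\#\tau}, op_{\#\tau}) \in Tr(\mathcal{L}_{S^A}) \\ &\exists R_0, \ldots, R_{\#\tau}, T_0, \ldots, T_{\#\tau}, t_0, \ldots, t_{\#\tau}, \omega_1, \ldots, \omega_{\#\tau} : \\ &\quad a)\; R_0 = Init^A \wedge T_0 = Init^C \\ &\quad b)\; \forall\, 0 \leq i \leq \#\tau : t_i \in T_i \wedge (T_i, R_i) \in ABS \\ &\quad c)\; \forall\, 1 \leq i \leq \#\tau : ((\iota_i, t_{i-1}), (t_i, \omega_i)) \in op_i^{S^C} \\ &\quad d)\; \forall\, 0 \leq i \leq \#\tau : \emptyset \neq R_i \subseteq ptrace_{S^A}(\tau \uparrow i) \\ &\quad e)\; \forall\, 1 \leq i \leq \#\tau \; \forall s_1 \in R_i \; \exists s_0 \in R_{i-1} : ((\iota_i, s_0), (s_1, \omega_i)) \in op_i^{S^A} \end{aligned} \qquad (3)$$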

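Base case ($\tau = \varepsilon$): We have $(Init^C, Init^A) \in ABS$ and $ptrace_{S^A}(\varepsilon) =
Init^A \neq \emptyset$ by definition. Similarly, we can select $t_0 \in T_0 = Init^C \neq \emptyset$.

Induction step ($\tau' = \tau(\iota, op) \in Tr(\mathcal{L}_{S^A})$): For $\tau \in Tr(\mathcal{L}_{S^A})$ we can find
$R_0, \ldots, R_{\#\tau}$, $T_0, \ldots, T_{\#\tau}$, $t_0, \ldots, t_{\#\tau}$, $\omega_1, \ldots, \omega_{\#\tau}$ that satisfy property (3) according
to our induction hypothesis. Since $S^A$ is demonic, it follows from property (3
d) that $\emptyset \neq R_{\#\tau} \subseteq ptrace_{S^A}(\tau) \subseteq pre_{S^A}((\iota, op))$. From (P1) and $(T_{\#\tau}, R_{\#\tau}) \in
ABS$ we can infer $T_{\#\tau} \subseteq pre_{S^C}((\iota, op))$. Then we find $t' \in St^C$, $\omega \in Out$ such
that $((\iota, t_{\#\tau}), (t', \omega)) \in op^{S^C}$. Property (P2) then supplies us with $R' \in \mathbb{P} St^A$,
$T' \in \mathbb{P} St^C$ such that $R' \neq \emptyset$, $(T', R') \in ABS$ and $t' \in T'$. Furthermore, for
any $s_1 \in R'$ there is an $s_0 \in R_{\#\tau}$ with $((\iota, s_0), (s_1, \omega)) \in op^{S^A}$. We know by our
induction hypothesis that $R_{\#\tau} \subseteq ptrace_{S^A}(\tau)$ and therefore $R' \subseteq ptrace_{S^A}(\tau')$.
The induction step is completed by setting $R_{\#\tau'} = R'$, $T_{\#\tau'} = T'$, $t_{\#\tau'} = t'$ and
$\omega_{\#\tau'} = \omega$.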

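We proceed by proving the following property by induction on the length of
the traces in $Tr(\mathcal{L}_{S^C}) \cap Tr(\mathcal{L}_{S^A})$.

$$\begin{aligned} &\forall h = ((\iota_1, op_1), \omega_1) \ldots ((\iota_{\#h}, op_{\#h}), \omega_{\#h}) \in \mathcal{L}_{S^C} : \mathcal{I}(h) \in Tr(\mathcal{L}_{S^A}) \Rightarrow \\ &\forall t_0, \ldots, t_{\#h} \in St^C : \\ &\quad (t_0 \in Init^C \wedge (\forall\, 1 \leq i \leq \#h : ((\iota_i, t_{i-1}), (t_i, \omega_i)) \in op_i^{S^C})) \Rightarrow \\ &\quad \exists R_0, \ldots, R_{\#h}, T_0, \ldots, T_{\#h} : \\ &\qquad a)\; R_0 = Init^A \wedge T_0 = Init^C \\ &\qquad b)\; \forall\, 0 \leq i \leq \#h : t_i \in T_i \wedge (T_i, R_i) \in ABS \\ &\qquad c)\; \forall\, 0 \leq i \leq \#h : \emptyset \neq R_i \subseteq final_{S^A}(h \uparrow i) \\ &\qquad d)\; \forall\, 1 \leq i \leq \#h \; \forall s_1 \in R_i \; \exists s_0 \in R_{i-1} : ((\iota_i, s_0), (s_1, \omega_i)) \in op_i^{S^A} \end{aligned} \qquad (4)$$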

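Base case ($h = \varepsilon$): This trivially holds because $final_{S^A}(\varepsilon) = Init^A \neq \emptyset$ and
$(Init^C, Init^A) \in ABS$.

Induction step ($h' = h((\iota_{\#h'}, op_{\#h'}), \omega_{\#h'}) \in \mathcal{L}_{S^C}$ with $\mathcal{I}(h') \in Tr(\mathcal{L}_{S^A})$):
Let $t_0, \ldots, t_{\#h'} \in St^C$ with $t_0 \in Init^C$ and $((\iota_i, t_{i-1}), (t_i, \omega_i)) \in op_i^{S^C}$, $1 \leq i \leq \#h'$.

By the induction hypothesis there are sets $R_0, \ldots, R_{\#h}$, $T_0, \ldots, T_{\#h}$ that satisfy
the property (4). That is $\emptyset \neq R_{\#h} \subseteq final_{S^A}(h) \subseteq ptrace_{S^A}(\mathcal{I}(h))$, $(T_{\#h}, R_{\#h}) \in
ABS$ and $\mathcal{I}(h)(\iota_{\#h'}, op_{\#h'}) \in Tr(\mathcal{L}_{S^A})$. The specification $S^A$ is demonic and
therefore $R_{\#h} \subseteq ptrace_{S^A}(\mathcal{I}(h)) \subseteq pre_{S^A}((\iota_{\#h'}, op_{\#h'}))$. With (P2) we then can
find $R' \in \mathbb{P} St^A$, $T' \in \mathbb{P} St^C$ such that $R' \neq \emptyset$, $(T', R') \in ABS$ and $t_{\#h'} \in T'$.

Furthermore, for any $s_1 \in R'$ there is an $s_0 \in R_{\#h}$ with $((\iota_{\#h'}, s_0), (s_1, \omega_{\#h'})) \in
op_{\#h'}^{S^A}$. Since $R_{\#h} \subseteq final_{S^A}(h)$ we can conclude that $R' \subseteq final_{S^A}(h')$. The
induction step is completed by setting $R_{\#h'} = R'$ and $T_{\#h'} = T'$.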

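Finally, note that property (3) implies $Tr(\mathcal{L}_{S^A}) \subseteq Tr(\mathcal{L}_{S^C})$ and from
property (4) we can infer $\Omega_{\mathcal{L}_{S^C}}(\tau) \subseteq \Omega_{\mathcal{L}_{S^A}}(\tau)$ for every trace $\tau \in Tr(\mathcal{L}_{S^A})$. This
implies $\mathcal{L}_{S^C} \Subset \mathcal{L}_{S^A}$. $\Box$

As a corollary, power-refinement $\sqsubseteq$ defines a preorder on the set of demonic
specifications.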

5 Conclusions and Related Work 

In this paper, we have extended the results on the refinement of demonic lan- 
guages from P33- We have presented a lattice structure for the set of demonic 
languages and proved the correspondence to the set of demonic specifications. 
By lifting the common notion of abstraction relations on state spaces to relati- 
ons on the power sets of state spaces, it was possible to derive a single complete 
refinement rule. This refinement rule provides a characterisation of the ordering 
on demonic languages in terms of a refinement preorder on demonic specificati- 
ons. This complements the classical characterisation by backward and forward 
refinement proven in m- 

The refinement ordering we use is consistent with the conventional semantics 
underlying a system specification in Z and VDM PP]. Hence the forward and 
backward refinement techniques of these specification languages are sound and 
complete methods for the verification of the refinement ordering (g on demonic 
languages. To our knowledge, there is no refinement theory founded on abstrac- 
tion relations on power sets of states. All relational approaches we are aware of 
work with abstraction relations on state spaces and therefore need at least two 
refinement rules for completeness results [I5lt)f7ltill I I I 4| I ,'tj or significant restric- 
tions on the languages and specifications. Furthermore, those approaches are 
founded on the subset ordering on prefix-closed languages. 

The restriction to demonic specifications and the generalisation to abstraction
relations on power sets make it possible to obtain a similar completeness
result as in the predicate transformer setting [2] with only one refinement
technique. As for predicate transformers, finding abstraction relations with a single
complete rule is in general more difficult than relying on the combination of two 
less complex rules. Hence for practical applications the combination of back- 
ward and forward refinement rules is recommended. Comparing the predicate 
transformer approach with our approach, Rewitzky and Brink m show that 
predicate transformers can be interpreted as total functions on the power sets 
of the state spaces. The predicate transformer refinement ordering can then be 
seen as the subset ordering on the underlying behaviours, which implies that 
traces can disappear during the refinement process. Any cosimulation may be 
seen as a total function on the power sets of the state spaces. Our refinement or- 
dering on demonic languages and specifications differs from the subset ordering 
in that traces cannot disappear during the refinement. Furthermore, a demonic 
composition principle in specifications is more general than the composition by 
total operations. Also, in our relational approach, a restriction to abstraction 
functions on power sets would be insufficient. 

Acknowledgements: We thank Ian Hayes and the anonymous referees for their 
suggestions on earlier versions of the paper. 
Abstract. Compositional design is concerned with both constructing 
systems by composing components and with deconstructing systems into 
proposed sets of components. In bottom-up design, engineers prove sy- 
stem properties given properties of components and a compositional 
structure. In top-down design, they propose properties of components 
and a compositional structure given system properties. In this paper we 
show how the theory of predicate transformers, which has been used 
so successfully in sequential programming, can be applied to composi- 
tional design of systems. The rules of composition we study are more 
general than the rules employed in sequential programming, and the sy- 
stems we study are not limited to programs. We exploit theorems about 
weakest and strongest solutions to equations to obtain a collection of 
useful predicate transformers, and then we exploit the theory of con- 
jugate transformers to obtain more useful transformers. We show how 
these transformers are useful for both bottom-up and top-down design. 



1 Motivation 

1.1 Composition and Compositional Properties 

Composition is the most fundamental operation in design. Designers of space ve- 
hicles, buildings and programs have common concerns: How to compose systems 
from components and how to partition systems into components. Compositional 
design offers the hope of managing complexity by avoiding unnecessary detail: 
systems designers prove properties of systems given properties, but not detailed 
implementations, of components. 

We introduce an informal concept of compositional properties to motivate 
our exploration, and define terms precisely later. Compositional properties are 
those classes of properties that allow designers to deduce system properties from 
component properties using simple rules. For example, mass is a compositional 
property because the mass of a system can be deduced in a simple way from 
the masses of components: the system mass is the sum of component masses. By 
contrast, temperature does not appear to be a compositional property because 
the temperature of a system depends in very complex ways on the shapes, masses, 
insulation properties, power consumption, locations of the components, etc. 

Designers have to compute properties of composed systems given properties 
of components, whether the properties are compositional or not. The challenge 
is to develop theories that help designers prove system properties they need from 
component properties. 

In this paper, we restrict ourselves to properties that are predicates on sy- 
stems. We explore properties that are compositional in the following sense: we 
can prove that a property holds for a system given that the property holds for 
some or all of its components. In later papers we propose to explore other kinds 
of compositional properties. 

In our paper, systems are abstract entities. They are not necessarily programs 
and they may not have “states” or “computations.” We consider composition 
operators that have certain algebraic properties, such as associativity, and we 
explore theorems that are derived solely from these properties. Our goal is to 
explore composition in the abstract as opposed to studying how composition is 
used in constructing specific kinds of systems. 

The simplest rules are those that establish that a property X holds for a sy- 
stem given that (i) property X holds for at least one component, or (ii) property 
X holds for all components. Therefore, in this paper, we restrict attention to two 
kinds of compositional properties: existential properties and universal properties. 
A property is an existential property exactly when, for all systems, a system has 
the property if there exists a component of the system that has the property. A 
property is a universal property exactly when, for all systems, a system has the 
property if all components of the system have the property. 



1.2 An Introduction to Property Transformers for Composition 

We motivate our exploration of predicate transformers by a few examples, and 
then develop the theory. 



Questions about the Weakest Existential Transformer. Consider the
following specification S for a component F: All systems that have F as a
component must have a property X.

We postulate that any system is a component of itself. Since F is a component
of F it follows that F itself must have property X. If X is an existential property,
then from the definition of existential properties it follows that X holds for all
systems that contain F as a component. Therefore, if X is an existential property,
the given specification S is equivalent to the simpler specification: X holds in F.
What if property X is not existential?

Suppose we can define a predicate transformer WE where WE.X is an
existential property stronger than X. If we can demonstrate that component F
has existential property WE.X, then any system that includes component F also
has property WE.X, and therefore also enjoys the weaker property X. Let S' be
the following specification of F: WE.X holds in F. From the above argument, it
follows that specification S' is stronger than specification S.

Examples of the kinds of questions that we wish to explore are the following.
For a given property X, is there a weakest existential property at least as
strong as X? And, if this weakest property exists and we define WE.X to be this
property, then are specifications S' and S equivalent?



Questions about the Strongest Existential Transformer. Next, consider 
a dual set of questions. Given that system F has property X, what properties 
can we deduce about all systems that have F as a component? If X is existential,
then all systems that have F as a component also have property X. But, what 
if X is not existential? 

Suppose we can define a predicate transformer SE where SE.X is an existential
property weaker than X. If F has property X then it also has the weaker property
SE.X and, since SE.X is existential, all systems that contain F as a component
also satisfy SE.X. The obvious question to explore is: Can we define SE.X to be
the strongest existential property weaker than X? And if we can, is SE.X the
strongest property that holds for all systems that contain F as a component?



Questions about the Conjugate Weakest Existential Transformer. Now, 
consider an engineer designing a system top down. The designer is given the
specification that the system must have property X. The designer asks the question:
Can I restrict myself to considering only those components that have a property
Y? In other words, can we prove that any system that contains a component
that does not have property Y also does not have property X? We will show
that the conjugate of the weakest existential transformer is helpful in answering 
this question. 



Questions about Universal Transformers. We also consider the analogous
case for universal properties. We explore the following question: Is there a
weakest property Y such that, if all components of any system have property Y,
then the system has property X? If X is a universal property and all components
have property X then the system itself has property X. So, since Y must
be at least as strong as X, Y is the same as X. What if X is not universal?

We can introduce a predicate transformer WU with the requirement that
WU.X is universal and stronger than X. If we can then prove that all components
of a system have property WU.X then we can conclude that the system enjoys
this property and hence also enjoys the weaker property X.

Can we require that WU.X be the weakest universal property stronger than X?
We cannot, because there does not exist, in general, a
weakest universal property stronger than X. What are good ways of defining
WU, then? We do not have answers to this question.

In the main body of this paper, we explore similar questions about strongest 
universal transformers and their conjugates. 

2 Terminology and Notations

2.1 Composition 

We postulate a composition operator denoted by $\circ$ and we restrict ourselves to
systems built from a finite number of applications of that operator. We assume
that $\circ$ is associative. We do not assume other properties such as symmetry or
idempotency. We do not interpret systems, and we do not consider how systems
are constructed by composing elemental or atomic systems. All that is relevant
in this paper is that systems can be composed to obtain systems.

We postulate the existence of a binary relation $\surd$ and we only consider
systems $F \circ G$ for those components F and G for which $F \surd G$ holds. We assume
that $F \surd G$ denotes that F can be composed with G (in that order).

We assume that if the system $F \circ G \circ H$ can be constructed, then it can be
constructed by first composing F with G and then composing the resulting
system with H, or by first composing G with H and then composing the resulting
system with F on the left. Specifically, we assume the following property of $\surd$,
for any F, G and H:

$$ F \surd G \;\wedge\; (F \circ G) \surd H \;\equiv\; G \surd H \;\wedge\; F \surd (G \circ H) . $$

We assume the existence of a UNIT component that satisfies the following 
axiom for all systems F: 

$$ \mathit{UNIT} \surd F \;\wedge\; F \surd \mathit{UNIT} \;\wedge\; (\mathit{UNIT} \circ F = F \circ \mathit{UNIT} = F) . \qquad (1) $$

For some theorems, we need the additional axiom that the unit system cannot 
have non-unit components: 

$$ (F \circ G = \mathit{UNIT}) \;\equiv\; (F = \mathit{UNIT}) \wedge (G = \mathit{UNIT}) . \qquad (2) $$

Most results presented in this paper do not require this additional axiom. Ho- 
wever, some results do. These results are marked with a ® sign. 

2.2 Membership Relation 

We introduce a specific notation to denote that a system F is part of a system G: 

$$ F \lhd G \;\equiv\; (\exists H, K : H \surd F \;\wedge\; H \circ F \surd K : G = H \circ F \circ K) . $$

Note that, because of axiom (1) on the UNIT element, $\lhd$ is a reflexive
relation and $\mathit{UNIT} \lhd F$ is true for any F.

2.3 Properties and Specifications 

Properties are point-wise predicates on systems. We treat a property as a
boolean-valued function with a single argument of type system. We use dot notation to
denote function application, as in f.x denotes the application of function f to x.
Therefore, for any property X and any system F, the notation X.F denotes
the boolean: system F has property X. Following [?], we use square brackets
to denote that a predicate is “everywhere true”. For a property X, [X] is the 
boolean: property X holds in all systems. 

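We introduce two properties that are specific to the UNIT component,
$\mathit{UNIT}_{=}$ (“to be the UNIT”) and its negation $\mathit{UNIT}_{\neq}$ (“to be different from
the UNIT”):

$$ \mathit{UNIT}_{=}.F \;\equiv\; (F = \mathit{UNIT}) \quad\text{and}\quad \mathit{UNIT}_{\neq}.F \;\equiv\; (F \neq \mathit{UNIT}) . $$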

2.4 Bags of Colored Balls 

In order to illustrate the ideas presented in this paper, we use a simple model 
of components. In this model, systems are bags of colored balls. Composition 
corresponds to bag-union and the UNIT element is the empty bag. Note that 
this bag model satisfies axiom (2). Therefore, properties marked with ® can be
used when reasoning on this model. 

Bags can always be composed (i.e., the relation yj is always true) and com- 
position is symmetric (Abelian monoid), but we do not rely on these additional 
properties of the model for they may not be true of more interesting models. 
For instance, sequential composition of programs is not symmetric and parallel 
composition of processes may not always be possible (for example, if a process 
references a local variable from another process). 
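The bag model can be made concrete with a few lines of code. The following is a minimal sketch, not part of the original paper: the names `COLORS`, `compose`, `bags` and `is_part` are hypothetical, bags are represented as sorted tuples of color names, and the checks only range over bags of at most two balls, so they illustrate the axioms rather than prove them.

```python
from itertools import combinations_with_replacement

COLORS = ("blue", "green", "red")   # hypothetical palette, kept sorted
UNIT = ()                           # the empty bag

def compose(f, g):
    """Bag union: every ball of both bags, sorted into a canonical tuple."""
    return tuple(sorted(f + g))

def bags(max_size):
    """Every bag over COLORS holding at most max_size balls."""
    return [b for n in range(max_size + 1)
            for b in combinations_with_replacement(COLORS, n)]

def is_part(f, g):
    """F is part of G: in the bag model, F is a sub-bag of G."""
    return all(f.count(c) <= g.count(c) for c in set(f))

small = bags(2)

# Axiom (1): UNIT is a left and right identity of composition.
axiom1 = all(compose(UNIT, f) == f == compose(f, UNIT) for f in small)

# Axiom (2): a composition is UNIT exactly when both operands are UNIT.
axiom2 = all((compose(f, g) == UNIT) == (f == UNIT and g == UNIT)
             for f in small for g in small)

# Associativity of composition.
associative = all(compose(compose(f, g), h) == compose(f, compose(g, h))
                  for f in small for g in small for h in small)

# The sub-bag test agrees with the definition G = H o F o K on small bags.
matches_definition = all(
    is_part(f, g) == any(compose(compose(h, f), k) == g
                         for h in small for k in small)
    for f in small for g in small)

print(axiom1, axiom2, associative, matches_definition)
```

Because every pair of bags can be composed, the compatibility relation is omitted from this sketch; a model with a non-trivial relation would need an extra guard in each quantification.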

3 Existential and Universal Properties 

3.1 Existential Properties 

A property X is existential (denoted by the boolean exist. X) if and only if X 
holds in all systems that contain a component that satisfies X: 

$$ \mathit{exist}.X \;\equiv\; (\forall F, G : F \surd G : X.F \vee X.G \Rightarrow X.(F \circ G)) . $$

The following results can be proved about existential properties: 

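$$ \mathit{exist}.X \wedge \mathit{exist}.Y \;\Rightarrow\; \mathit{exist}.(X \wedge Y), $$
$$ \mathit{exist}.X \wedge \mathit{exist}.Y \;\Rightarrow\; \mathit{exist}.(X \vee Y), $$
$$ \mathit{exist}.X \;\Rightarrow\; (X.\mathit{UNIT} \equiv [X]), $$
$$ \mathit{exist}.\mathit{UNIT}_{\neq} \quad \text{®} . $$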

We define a function guarantees, from pairs of properties to properties. The
property X guarantees Y holds for a system F if and only if, for all systems G
that contain F as a component, if X holds for G then Y also holds for G:

$$ (X \;\mathit{guarantees}\; Y).F \;\equiv\; (\forall G : F \lhd G : X.G \Rightarrow Y.G) . $$

Proposition 1 For all properties X and Y,

$$ \mathit{exist}.(X \;\mathit{guarantees}\; Y) . $$

Thus, guarantees provides us with a systematic way of building existential 
properties from other kinds of properties. The existential properties described 
with guarantees have been used successfully in compositional specifications and 
proofs of distributed systems [?].
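The definition of exist can be tried out on the bag model. The sketch below is an illustration under stated assumptions: `looks_existential` is a hypothetical helper that tests the defining implication only on bags of at most three balls, so a `True` answer is evidence within that bound and only a `False` answer is conclusive.

```python
from itertools import combinations_with_replacement

COLORS = ("blue", "green", "red")   # hypothetical palette, kept sorted

def compose(f, g):
    """Bag union, with a canonical sorted-tuple representation."""
    return tuple(sorted(f + g))

def bags(max_size):
    """Every bag over COLORS holding at most max_size balls."""
    return [b for n in range(max_size + 1)
            for b in combinations_with_replacement(COLORS, n)]

def looks_existential(x, max_size=3):
    """Bounded test of exist.X: X.F or X.G implies X.(F o G)."""
    bs = bags(max_size)
    return all(x(compose(f, g)) for f in bs for g in bs if x(f) or x(g))

def has_blue(b):
    return "blue" in b

def two_colors(b):
    return len(set(b)) >= 2

def no_blue(b):
    return "blue" not in b

def non_empty(b):          # plays the role of "different from UNIT"
    return b != ()

print(looks_existential(has_blue))
print(looks_existential(two_colors))
print(looks_existential(lambda b: has_blue(b) and two_colors(b)))
print(looks_existential(lambda b: has_blue(b) or two_colors(b)))
print(looks_existential(non_empty))
print(looks_existential(no_blue))   # refuted: add a blue ball to an empty bag
```

The conjunction and disjunction cases mirror the first two laws listed above, and `non_empty` mirrors the law marked with ®, which relies on axiom (2).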

3.2 Universal Properties 

A property X is universal (denoted by the boolean univ.X) if and only if X is 
true in all systems built from components all of which satisfy X: 

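$$ \mathit{univ}.X \;\equiv\; (\forall F, G : F \surd G : X.F \wedge X.G \Rightarrow X.(F \circ G)) . $$

From the definitions of exist and univ, any existential property is also universal:

$$ \mathit{exist}.X \;\Rightarrow\; \mathit{univ}.X . $$

Moreover, the following properties can be proved:

$$ \mathit{univ}.X \wedge \mathit{univ}.Y \;\Rightarrow\; \mathit{univ}.(X \wedge Y), $$
$$ \mathit{univ}.X \wedge \mathit{exist}.Y \;\Rightarrow\; \mathit{univ}.(X \vee Y), $$
$$ \mathit{univ}.\mathit{UNIT}_{=} . $$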
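A bounded test in the same spirit can be written for univ. This is a sketch under assumptions: `looks_universal` is a hypothetical helper restricted to bags of at most three balls, and the sample properties are chosen only for illustration. The last two lines show why the disjunction law above needs one existential operand: a disjunction of two universal properties can fail to be universal.

```python
from itertools import combinations_with_replacement

COLORS = ("blue", "green", "red")   # hypothetical palette, kept sorted

def compose(f, g):
    """Bag union, with a canonical sorted-tuple representation."""
    return tuple(sorted(f + g))

def bags(max_size):
    """Every bag over COLORS holding at most max_size balls."""
    return [b for n in range(max_size + 1)
            for b in combinations_with_replacement(COLORS, n)]

def looks_universal(x, max_size=3):
    """Bounded test of univ.X: X.F and X.G implies X.(F o G)."""
    bs = bags(max_size)
    return all(x(compose(f, g)) for f in bs for g in bs if x(f) and x(g))

def all_red(b):
    return all(c == "red" for c in b)

def all_blue(b):
    return all(c == "blue" for c in b)

def has_blue(b):           # existential, hence universal
    return "blue" in b

def is_unit(b):            # "to be the UNIT"
    return b == ()

print(looks_universal(all_red), looks_universal(all_blue))
print(looks_universal(has_blue), looks_universal(is_unit))
print(looks_universal(lambda b: all_red(b) and all_blue(b)))
print(looks_universal(lambda b: all_red(b) or has_blue(b)))
# Two universal operands are not enough for a disjunction:
print(looks_universal(lambda b: all_red(b) or all_blue(b)))
```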

3.3 All-Components and Some-Component Properties

We define two additional forms of composition as the duals of existential and
universal composition. A property is all-components if and only if its negation
is existential:

$$ \mathit{all\text{-}c}.X \;\equiv\; \mathit{exist}.(\neg X) . $$

Unfolding the definition of exist, we find that a property is an all-components
property if and only if, when it holds for any system, it holds for all components
of that system:

$$ \mathit{all\text{-}c}.X \;\equiv\; (\forall F, G : F \surd G : X.(F \circ G) \Rightarrow X.F \wedge X.G) . $$

In the same way, we define some-component properties as the duals of
universal properties:

$$ \mathit{some\text{-}c}.X \;\equiv\; \mathit{univ}.(\neg X) . $$

A property is a some-component property if and only if, when it holds for any
system, it holds for at least one component of that system:

$$ \mathit{some\text{-}c}.X \;\equiv\; (\forall F, G : F \surd G : X.(F \circ G) \Rightarrow X.F \vee X.G) . $$
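The two formulations of each dual (through negation, and unfolded) can be compared mechanically on the bag model. The sketch below is illustrative only: all helper names are hypothetical, and every quantification is bounded to bags of at most three balls.

```python
from itertools import combinations_with_replacement

COLORS = ("blue", "green", "red")   # hypothetical palette, kept sorted

def compose(f, g):
    return tuple(sorted(f + g))

BAGS = [b for n in range(4) for b in combinations_with_replacement(COLORS, n)]
PAIRS = [(f, g, compose(f, g)) for f in BAGS for g in BAGS]

def neg(x):
    return lambda b: not x(b)

def looks_existential(x):
    return all(x(fg) for f, g, fg in PAIRS if x(f) or x(g))

def looks_universal(x):
    return all(x(fg) for f, g, fg in PAIRS if x(f) and x(g))

# Definitions through negation.
def looks_all_c(x):
    return looks_existential(neg(x))

def looks_some_c(x):
    return looks_universal(neg(x))

# Unfolded definitions: from the system down to its components.
def unfolded_all_c(x):
    return all(x(f) and x(g) for f, g, fg in PAIRS if x(fg))

def unfolded_some_c(x):
    return all(x(f) or x(g) for f, g, fg in PAIRS if x(fg))

samples = [
    lambda b: len(b) <= 3,           # at most three balls
    lambda b: b.count("red") == 1,   # exactly one red ball
    lambda b: "blue" in b,
    lambda b: "blue" not in b,
]
agree = all(looks_all_c(p) == unfolded_all_c(p) and
            looks_some_c(p) == unfolded_some_c(p) for p in samples)
print(agree)
```

The agreement is expected, since each unfolded form is the contrapositive of the corresponding negated definition.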

Existential and universal properties are used in bottom-up design. Their duals
are used in top-down design.

- Bottom-up. Given components that have existential or universal properties,
  designers can deduce properties of systems composed from these components
  using very simple rules.

- Top-down. Given that a system should satisfy an all-components property,
  designers know that they must design all components to also satisfy that
  property. Likewise, given that a system should have a some-component
  property, designers know that they must design at least one component to have
  that property.

Of course, systems may be specified in terms of properties that are not exi- 
stential, universal, all-components or some-component. However, as shown in the 
next sections, any property can be related in a systematic way to some properties 
that have one of these characteristics. 

3.4 Examples of Properties 

In the bags of colored balls model, the following properties are examples of 
existential, universal, all-components and some-component properties: 



exist  . (at least one blue ball in the bag)                             (3)
exist  . (at least two balls of different colors in the bag)
univ   . (no ball in the bag is blue)                                    (4)
univ   . (all the balls in the bag are red, or at least two are blue)
all-c  . (all the balls in the bag are blue, or all the balls are red)
all-c  . (there are at most three balls in the bag)
some-c . (the bag contains more blue balls than red balls)               (5)
some-c . (exactly one ball in the bag is red)

Note that (3) is also some-component, (4) is also all-components and (5)
is also universal. All other example properties only have the stated
characteristic (besides the fact that any existential property is universal and any
all-components property is some-component).
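The classification of the eight example properties can be cross-checked by brute force. The sketch below is an illustration, not the authors' method: `profile` is a hypothetical helper returning the four bounded tests (exist, univ, all-c, some-c) over bags of at most three balls drawn from an assumed three-color palette.

```python
from itertools import combinations_with_replacement

COLORS = ("blue", "green", "red")   # hypothetical palette, kept sorted

def compose(f, g):
    return tuple(sorted(f + g))

BAGS = [b for n in range(4) for b in combinations_with_replacement(COLORS, n)]
PAIRS = [(f, g, compose(f, g)) for f in BAGS for g in BAGS]

def profile(x):
    """Bounded (exist, univ, all-c, some-c) flags for property x."""
    return (
        all(x(fg) for f, g, fg in PAIRS if x(f) or x(g)),
        all(x(fg) for f, g, fg in PAIRS if x(f) and x(g)),
        all(x(f) and x(g) for f, g, fg in PAIRS if x(fg)),
        all(x(f) or x(g) for f, g, fg in PAIRS if x(fg)),
    )

def all_of(color, b):
    return all(c == color for c in b)

EXAMPLES = {
    "at least one blue ball": lambda b: "blue" in b,
    "at least two different colors": lambda b: len(set(b)) >= 2,
    "no ball is blue": lambda b: "blue" not in b,
    "all red, or at least two blue":
        lambda b: all_of("red", b) or b.count("blue") >= 2,
    "all blue, or all red":
        lambda b: all_of("blue", b) or all_of("red", b),
    "at most three balls": lambda b: len(b) <= 3,
    "more blue balls than red balls":
        lambda b: b.count("blue") > b.count("red"),
    "exactly one red ball": lambda b: b.count("red") == 1,
}

RESULTS = {name: profile(x) for name, x in EXAMPLES.items()}
for name, flags in RESULTS.items():
    print(f"{name:32s} exist={flags[0]} univ={flags[1]} "
          f"all-c={flags[2]} some-c={flags[3]}")
```

Within this bound, the flags agree with the table and with the remark that (3) is also some-component, (4) is also all-components and (5) is also universal.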

4 Property Transformers for Composition 

4.1 Extreme Solutions of Equations in Predicates 

As claimed in sect. 3, conjunctions of universal properties are universal and
conjunctions and disjunctions of existential properties are existential¹. An equation
in predicates has a weakest solution if and only if the disjunction of all soluti- 
ons is itself a solution, and in this case the weakest solution is that disjunction. 
Likewise, an equation in predicates has a strongest solution if and only if the 
conjunction of all solutions is itself a solution, and in this case the strongest 
solution is that conjunction [?].



¹ In sect. 3, junctivity is stated finitely for convenience, but actually holds for an
infinite number of predicates.

4.2 The Property Transformer WE

We consider the following equation in Z, parametrized by predicate X:

$$ Z : [Z \Rightarrow X] \wedge \mathit{exist}.Z . \qquad (6) $$

A property Z is a solution of (6) if and only if it is existential and stronger than
X. Since disjunctions of existential properties are existential, equation (6) has a
weakest solution, and we denote that weakest solution by WE.X:

$$ \mathit{WE}.X \;\equiv\; (\exists Z : [Z \Rightarrow X] \wedge \mathit{exist}.Z : Z) . $$

For any property X, WE.X is the weakest existential property stronger than X.

Suppose we wish to design a system F so that any system that has F as 
a component enjoys property X. What properties must system F have? The 
next theorem tells us that the necessary and sufficient specification of such a 
component F is that it satisfies WE.X.

Proposition 2 WE.X is the weakest property of a component that ensures that
any system that contains that component will enjoy property X:

$$ (\forall F, G : F \lhd G : Y.F \Rightarrow X.G) \;\equiv\; [Y \Rightarrow \mathit{WE}.X] . $$

This is a consequence of the following proposition that states that, for a
component, to bring the property X to any system containing the component, or to
satisfy WE.X are equivalent.

Proposition 3 $\mathit{WE}.X.F \;\equiv\; (\forall G : F \lhd G : X.G)$ .
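Proposition 3 suggests a direct way to experiment with WE on the bag model: a bag F satisfies WE.X when X holds in every bag that contains F. Since there are infinitely many such bags, the sketch below only tests extensions by at most `max_extra` balls; this bound, the helper names and the sample properties are all assumptions of the illustration, so `we` is an approximation of WE rather than WE itself.

```python
from itertools import combinations_with_replacement

COLORS = ("blue", "green", "red")   # hypothetical palette, kept sorted

def compose(f, g):
    return tuple(sorted(f + g))

def bags(max_size):
    return [b for n in range(max_size + 1)
            for b in combinations_with_replacement(COLORS, n)]

def we(x, max_extra=3):
    """Bounded reading of Proposition 3: X holds in F o K for every tested K."""
    extras = bags(max_extra)
    return lambda f: all(x(compose(f, k)) for k in extras)

def has_blue(b):
    return "blue" in b

def blue_or_empty(b):
    return b == () or "blue" in b

def at_most_two(b):
    return len(b) <= 2

BAGS = bags(3)

# WE.X is stronger than X (take K to be the empty bag).
stronger = all(blue_or_empty(f) for f in BAGS if we(blue_or_empty)(f))
# An existential property is a fixed point of WE.
fixed_point = all(we(has_blue)(f) == has_blue(f) for f in BAGS)
# Conjunctivity: WE.(X and Y) equals WE.X and WE.Y.
conjunctive = all(
    we(lambda b: has_blue(b) and at_most_two(b))(f)
    == (we(has_blue)(f) and we(at_most_two)(f)) for f in BAGS)
# No component can promise a size bound on the systems that contain it.
never = not any(we(at_most_two)(f) for f in BAGS)

print(stronger, fixed_point, conjunctive, never)
```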

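Proposition 3 is proved in [?]. The following elementary properties hold as well:

$$ [\mathit{WE}.X \Rightarrow X], $$
$$ \mathit{exist}.X \;\equiv\; [\mathit{WE}.X \equiv X], $$
$$ [\mathit{WE}.(X \wedge Y) \;\equiv\; \mathit{WE}.X \wedge \mathit{WE}.Y], \qquad (7) $$
$$ [X \Rightarrow Y] \;\Rightarrow\; [\mathit{WE}.X \Rightarrow \mathit{WE}.Y], \qquad (8) $$
$$ [\mathit{WE}.X] \;\equiv\; [X] . $$

Property (7) states that the transformer WE is conjunctive. It can be shown that
WE is not disjunctive. Property (8) expresses that WE is monotonic.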

The property transformer WE can be used in two ways. Firstly, it provides
us with an abstract way of expressing that a component, all by itself, ensures a
system property X. Moreover, Proposition 3 can be used to derive proof rules
for properties of the form WE.X for a specific formalism. For instance, proof
rules for Unity logic are used in the correctness proof of a distributed system
in [?]. Secondly, properties such as (7) and (8) can be used to reason about
existential properties in general, without the inconvenience of a quantification
over components. For instance, using Proposition 3 it is easy to see that

$$ [X \;\mathit{guarantees}\; Y \;\equiv\; \mathit{WE}.(X \Rightarrow Y)] . $$

Then, it is easier to reason on WE to deduce properties of guarantees (such as
existentiality or transitivity).




4.3 The Property Transformer SE

The property transformer WE was defined using the fact that disjunctions of
existential properties are existential. Because conjunctions of existential properties
also are existential, the equation $Z : [X \Rightarrow Z] \wedge \mathit{exist}.Z$ has a strongest solution
SE.X:

$$ \mathit{SE}.X \;\equiv\; (\forall Z : [X \Rightarrow Z] \wedge \mathit{exist}.Z : Z) . $$

For any property X, SE.X is the strongest existential property weaker than X.
Among interesting properties concerning SE, we can prove:

$$ [X \Rightarrow \mathit{SE}.X], $$
$$ \mathit{exist}.X \;\equiv\; [\mathit{SE}.X \equiv X], $$
$$ [\mathit{SE}.(X \vee Y) \;\equiv\; \mathit{SE}.X \vee \mathit{SE}.Y], $$
$$ [X \Rightarrow Y] \;\Rightarrow\; [\mathit{SE}.X \Rightarrow \mathit{SE}.Y], $$
$$ [\mathit{SE}.X] \;\equiv\; (X.\mathit{UNIT}) \quad \text{®} . \qquad (9) $$

Property (9) has an interesting intuitive explanation. SE.X holds for all
systems that have at least one component that satisfies X. Since all systems have
UNIT as a component, for all properties X that hold for UNIT, all systems
have property SE.X. So, as expected, designers do not get useful information
from knowing that UNIT is a component of their systems.

Proposition 4 SE.X is the strongest property that can be deduced of any system
built from components, one of which, at least, satisfies X:

$$ (\forall F, G : F \lhd G : X.F \Rightarrow Y.G) \;\equiv\; [\mathit{SE}.X \Rightarrow Y] . $$
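In the bag model, SE can be computed exactly, because a bag has finitely many parts. The sketch below relies on one reading of Proposition 4, taken here as an assumption of the illustration: SE.X holds in G exactly when some part F of G satisfies X. The names `sub_bags` and `se` are hypothetical.

```python
from itertools import combinations, combinations_with_replacement

COLORS = ("blue", "green", "red")   # hypothetical palette, kept sorted

def bags(max_size):
    return [b for n in range(max_size + 1)
            for b in combinations_with_replacement(COLORS, n)]

def sub_bags(g):
    """All parts of the bag g (g is a sorted tuple, and so is every part)."""
    return {c for n in range(len(g) + 1) for c in combinations(g, n)}

def se(x):
    """SE.X holds in G when at least one part of G satisfies X."""
    return lambda g: any(x(f) for f in sub_bags(g))

def two_reds(b):           # exactly two balls, both red: not existential
    return b == ("red", "red")

def at_least_two_reds(b):
    return b.count("red") >= 2

def no_blue(b):            # holds for UNIT
    return "blue" not in b

BAGS = bags(4)

# SE weakens "exactly two red balls" into "at least two red balls".
weakened = all(se(two_reds)(g) == at_least_two_reds(g) for g in BAGS)
# X implies SE.X, since every bag is a part of itself.
weaker = all(se(two_reds)(g) for g in BAGS if two_reds(g))
# Property (9): when X holds for UNIT, SE.X holds everywhere.
everywhere = all(se(no_blue)(g) for g in BAGS)

print(weakened, weaker, everywhere)
```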



4.4 The Property Transformer SU 

From sect. 3 we know that conjunctions of universal properties are universal.
However, disjunctions of universal properties are not always universal. Therefore,
we cannot define a transformer WU in the same way as we defined WE. Actually,
it can be shown that some properties do not have a weakest universal property
stronger than them. Nevertheless, because conjunctions of universal properties
are universal, the equation $Z : [X \Rightarrow Z] \wedge \mathit{univ}.Z$ has a strongest solution SU.X:

$$ \mathit{SU}.X \;\equiv\; (\forall Z : [X \Rightarrow Z] \wedge \mathit{univ}.Z : Z) . $$

For any property X, SU.X is the strongest universal property weaker than X.
Because SE.X is universal (since it is existential) and weaker than X, we can
deduce:

$$ [\mathit{SU}.X \Rightarrow \mathit{SE}.X] . $$

Among interesting properties concerning SU, we can prove:

$$ [X \Rightarrow \mathit{SU}.X], $$
$$ \mathit{univ}.X \;\equiv\; [\mathit{SU}.X \equiv X], $$
$$ [X \Rightarrow Y] \;\Rightarrow\; [\mathit{SU}.X \Rightarrow \mathit{SU}.Y], $$
$$ [\mathit{SU}.X] \;\Rightarrow\; (X.\mathit{UNIT}) \quad \text{®} . \qquad (10) $$

Note that property (10) is only an implication because SU.X is, in general,
strictly stronger than SE.X. Also, SU is monotonic but neither conjunctive, nor
disjunctive.

If all components of a system satisfy X, they also satisfy SU.X and,
because SU.X is universal, the whole system satisfies SU.X. Moreover, SU.X is the
strongest property that can be deduced this way.

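Proposition 5 SU.X is the strongest property that can be deduced of any system
built from components that all satisfy X:

$$ (\forall F, G, H, \ldots : F \surd G \;\wedge\; F \circ G \surd H \;\wedge\; F \circ G \circ H \surd \cdots : X.F \wedge X.G \wedge X.H \wedge \cdots \Rightarrow Y.(F \circ G \circ H \circ \cdots)) \;\equiv\; [\mathit{SU}.X \Rightarrow Y] . $$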
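SU can also be computed exactly in the bag model. The sketch below assumes one reading of Proposition 5: SU.X holds in G exactly when G can be split into one or more components that all satisfy X. The recursion peels off one non-empty component at a time; empty components are not enumerated because, in this model, they can always be dropped from a split. The names `su` and `same_pair` are hypothetical.

```python
from collections import Counter
from functools import lru_cache
from itertools import combinations, combinations_with_replacement

COLORS = ("blue", "green", "red")   # hypothetical palette, kept sorted

def bags(max_size):
    return [b for n in range(max_size + 1)
            for b in combinations_with_replacement(COLORS, n)]

def su(x):
    """SU.X holds in G when G splits into components that all satisfy X."""
    @lru_cache(maxsize=None)
    def holds(g):
        if x(g):
            return True                     # G alone is a valid split
        for n in range(1, len(g)):          # proper, non-empty first component
            for f in set(combinations(g, n)):
                rest = tuple(sorted((Counter(g) - Counter(f)).elements()))
                if x(f) and holds(rest):
                    return True
        return False
    return holds

def same_pair(b):          # exactly two balls of the same color
    return len(b) == 2 and b[0] == b[1]

def no_blue(b):            # a universal property
    return "blue" not in b

BAGS = bags(4)
su_pair = su(same_pair)
su_no_blue = su(no_blue)

# A universal property is a fixed point of SU.
fixed_point = all(su_no_blue(g) == no_blue(g) for g in BAGS)
# SU.(same_pair): non-empty bags in which every color occurs an even number of times.
even_counts = all(
    su_pair(g) == (g != () and all(g.count(c) % 2 == 0 for c in COLORS))
    for g in BAGS)

print(fixed_point, even_counts)
```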



4.5 Example 

We consider the following question: What must be proved on a bag of balls to 
ensure that any system that contains that bag, if it contains at least one red 
ball, contains at least two balls of different colors? Formally, the specification of 
the component is that any system that contains that component satisfies: 

$$ (\text{at least 1 red ball}) \Rightarrow (\text{at least 2 colors}) . $$

Then, from Proposition 2, we know that the specification for the component is:

$$ \mathit{WE}.((\text{at least 1 red ball}) \Rightarrow (\text{at least 2 colors})) . \qquad (11) $$

In this section, we show how an explicit formulation of property (11) can be
calculated. The calculation relies on the following proposition, proved in [?]:

Proposition 6 $[X \equiv \mathit{UNIT}_{\neq}] \;\vee\; [\mathit{WE}.(\mathit{UNIT}_{=} \vee X) \equiv \mathit{WE}.X]$ ® .

Then, we can calculate an equivalent formulation of (11):

$$
\begin{array}{ll}
  & \mathit{WE}.((\text{at least 1 red ball}) \Rightarrow (\text{at least 2 colors})) \\
= & \{\text{Predicate calculus}\} \\
  & \mathit{WE}.(\mathit{UNIT}_{=} \vee (\text{at least 1 non red ball})) \\
= & \{\text{Assume there exist red balls, hence } \neg[(\text{at least 1 non red ball}) \equiv \mathit{UNIT}_{\neq}]\text{; apply prop. 6}\} \\
  & \mathit{WE}.(\text{at least 1 non red ball}) \\
= & \{\mathit{exist}.(\text{at least 1 non red ball})\} \\
  & (\text{at least 1 non red ball}) .
\end{array}
$$

Therefore, we deduce that the necessary and sufficient specification of the
component is that it should contain at least one non-red ball. In other words,
to ensure that any system that uses F as a component will have at least two
balls of different colors provided it has at least one red ball, it is sufficient and
necessary that F contains at least one non-red ball. This is consistent with our
intuition. However, instead of guessing the desired property of F and then proving
that it is both necessary and sufficient, we have calculated the property. This
provides us with both the property and the proof at the same time, and avoids
dealing explicitly with the universal quantification over components.
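The result of the calculation can be cross-checked by brute force on small bags. The sketch below is only a sanity check under assumptions: `we` approximates WE through Proposition 3 by testing extensions of at most three balls, the three-color palette is hypothetical, and the comparison ranges over bags of at most three balls.

```python
from itertools import combinations_with_replacement

COLORS = ("blue", "green", "red")   # hypothetical palette, kept sorted

def compose(f, g):
    return tuple(sorted(f + g))

def bags(max_size):
    return [b for n in range(max_size + 1)
            for b in combinations_with_replacement(COLORS, n)]

def we(x, max_extra=3):
    """Bounded reading of Proposition 3: X holds in F o K for every tested K."""
    extras = bags(max_extra)
    return lambda f: all(x(compose(f, k)) for k in extras)

def spec(b):
    """(at least 1 red ball) => (at least 2 colors)."""
    return ("red" not in b) or len(set(b)) >= 2

def has_non_red(b):
    """The calculated component specification: at least 1 non red ball."""
    return any(c != "red" for c in b)

we_spec = we(spec)
agrees = all(we_spec(f) == has_non_red(f) for f in bags(3))
print(agrees)
```

A single added red ball is enough to refute the specification for any bag without a non-red ball, which is why the bounded check and the calculated property coincide here.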

5 Conjugates of Property Transformers 

Any predicate transformer has a unique conjugate. Duality allows us to easily
deduce properties of a predicate transformer from properties of its conjugate. In
this section, we focus on the conjugates of WE, SE and SU.

5.1 Conjugate of a Predicate Transformer 

Let f be a predicate transformer. Its conjugate, denoted by f*, is defined by [?]:

$$ f^{*}.X \;\equiv\; \neg f.(\neg X) . $$

Duality provides us with a way of deducing properties of f* from properties
of f, and vice-versa. For instance, f is monotonic iff f* is monotonic. In the same
way, f is conjunctive (resp. disjunctive) iff f* is disjunctive (resp. conjunctive).
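Conjugation is a higher-order operation and is easy to express as code. In the sketch below, `conjugate` is a hypothetical helper that maps a transformer (a function from predicates to predicates) to its conjugate. It is exercised on a bag-model version of SE, using the assumed reading that SE.X holds in G when some part of G satisfies X.

```python
from itertools import combinations, combinations_with_replacement

COLORS = ("blue", "green", "red")   # hypothetical palette, kept sorted
BAGS = [b for n in range(4) for b in combinations_with_replacement(COLORS, n)]

def conjugate(t):
    """Return t*, where t*.X = not t.(not X)."""
    def t_star(x):
        tx = t(lambda b: not x(b))
        return lambda b: not tx(b)
    return t_star

def sub_bags(g):
    return {c for n in range(len(g) + 1) for c in combinations(g, n)}

def se(x):
    """Assumed bag-model reading of SE: some part of G satisfies X."""
    return lambda g: any(x(f) for f in sub_bags(g))

se_star = conjugate(se)

def x(b):
    return "blue" in b

def y(b):
    return len(b) <= 1

# Conjugation is an involution: (SE*)* behaves like SE.
involution = all(conjugate(se_star)(x)(g) == se(x)(g) for g in BAGS)
# SE is disjunctive ...
se_disjunctive = all(
    se(lambda b: x(b) or y(b))(g) == (se(x)(g) or se(y)(g)) for g in BAGS)
# ... so its conjugate is conjunctive.
se_star_conjunctive = all(
    se_star(lambda b: x(b) and y(b))(g) == (se_star(x)(g) and se_star(y)(g))
    for g in BAGS)

print(involution, se_disjunctive, se_star_conjunctive)
```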

5.2 The Property Transformer WE*

We define WE* as the conjugate of the property transformer WE:

$$ \mathit{WE}^{*}.X \;\equiv\; \neg \mathit{WE}.(\neg X) . $$

Because the duals of existential properties are all-components properties, we
can easily deduce that WE*.X is the strongest all-components property weaker
than property X. Moreover, applying the duality principle to Proposition 2, we
can deduce the following proposition.

Proposition 7 WE*.X is the strongest property that can be deduced of all
components of any system that satisfies X:

$$ (\forall F, G : F \lhd G : X.G \Rightarrow Y.F) \;\equiv\; [\mathit{WE}^{*}.X \Rightarrow Y] . $$
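
Duality can also be applied to the basic properties of WE to obtain the following
properties of WE*:

$$ [X \Rightarrow \mathit{WE}^{*}.X], $$
$$ \mathit{all\text{-}c}.X \;\equiv\; [\mathit{WE}^{*}.X \equiv X], $$
$$ [\mathit{WE}^{*}.(X \vee Y) \;\equiv\; \mathit{WE}^{*}.X \vee \mathit{WE}^{*}.Y], $$
$$ [X \Rightarrow Y] \;\Rightarrow\; [\mathit{WE}^{*}.X \Rightarrow \mathit{WE}^{*}.Y], $$
$$ [X \equiv \mathit{false}] \;\equiv\; [\mathit{WE}^{*}.X \equiv \mathit{false}] . $$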



5.3 The Property Transformer SE*

In the same way as in the previous section, we can study the
conjugate of the property transformer SE. Applying duality, SE*.X is the weakest
all-components property stronger than X. From Proposition 4 we obtain:



Proposition 8 SE*.X is the weakest property that must be proved on a system
to ensure that all components of the system satisfy X:

$$ (\forall F, G : F \lhd G : Y.G \Rightarrow X.F) \;\equiv\; [Y \Rightarrow \mathit{SE}^{*}.X] . $$
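Read through Proposition 8, SE*.X holds in a system exactly when every part of that system satisfies X, which is directly computable for bags. The sketch below assumes this reading; `se_star`, `se` and the sample property are hypothetical names chosen for the illustration. It also checks that the direct form agrees with the conjugate of the bag-model SE.

```python
from itertools import combinations, combinations_with_replacement

COLORS = ("blue", "green", "red")   # hypothetical palette, kept sorted
BAGS = [b for n in range(4) for b in combinations_with_replacement(COLORS, n)]

def sub_bags(g):
    return {c for n in range(len(g) + 1) for c in combinations(g, n)}

def se(x):
    """Assumed bag-model reading of SE: some part of G satisfies X."""
    return lambda g: any(x(f) for f in sub_bags(g))

def se_star(x):
    """SE*.X holds in G when every part of G satisfies X."""
    return lambda g: all(x(f) for f in sub_bags(g))

def not_single_red(b):     # X: the bag is not exactly one red ball
    return b != ("red",)

def no_red(b):
    return "red" not in b

# To keep every component from being a lone red ball,
# the system must contain no red ball at all.
strengthened = all(se_star(not_single_red)(g) == no_red(g) for g in BAGS)

# The direct form agrees with the conjugate: SE*.X = not SE.(not X).
by_conjugation = all(
    se_star(not_single_red)(g) == (not se(lambda b: not not_single_red(b))(g))
    for g in BAGS)

print(strengthened, by_conjugation)
```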



5.4 The Property Transformer SU* 

SU*.X, as the conjugate of SU.X, is the weakest some-component property
stronger than X. By duality, Proposition 5 gives:



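Proposition 9 SU*.X is the weakest property that must be proved on a system
to ensure that at least one component of the system satisfies X:

$$ (\forall F, G, H, \ldots : F \surd G \;\wedge\; F \circ G \surd H \;\wedge\; F \circ G \circ H \surd \cdots : Y.(F \circ G \circ H \circ \cdots) \Rightarrow X.F \vee X.G \vee X.H \vee \cdots) \;\equiv\; [Y \Rightarrow \mathit{SU}^{*}.X] . $$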



5.5 Comparison of the Six Transformers 



The following table summarizes the intuitive interpretation for each one of the 
six property transformers that we have defined. Each transformer corresponds 
to a different view of composition: 






M. Charpentier and K.M. Chandy 



WE.X     What must be proved on a component to ensure that any
         system that contains that component satisfies X.

SE.X     What can be deduced of any system that contains at least
         one component that satisfies X.

SU.X     What can be deduced of any system that contains only
         components that satisfy X.

WE*.X    What can be deduced on all components of any system
         that satisfies X.

SE*.X    What must be proved on a system to ensure that all
         components satisfy X.

SU*.X    What must be proved on a system to ensure that at least
         one component satisfies X.



5.6 Example 

We consider another example that uses the bags of balls model: If a system
contains exactly one ball and that ball is blue, what can we tell about its
components? Using Proposition 7 we can claim that all components in such a system
must satisfy:

$$ \mathit{WE}^{*}.(\text{exactly one ball and the ball is blue}) . \qquad (12) $$

Applying duality to Proposition 6 we deduce:

$$ [X \equiv \mathit{UNIT}_{=}] \;\vee\; [\mathit{WE}^{*}.(\mathit{UNIT}_{\neq} \wedge X) \equiv \mathit{WE}^{*}.X] \quad \text{®} . $$

We can then calculate an equivalent formulation of (12):

$$
\begin{array}{ll}
  & \mathit{WE}^{*}.(\text{exactly one ball and the ball is blue}) \\
= & \{\text{Predicate calculus}\} \\
  & \mathit{WE}^{*}.(\mathit{UNIT}_{\neq} \wedge (\text{all balls are blue}) \wedge (\text{at most one ball})) \\
= & \{\text{Apply the dual of prop. 6}\} \\
  & \mathit{WE}^{*}.((\text{all balls are blue}) \wedge (\text{at most one ball})) \\
= & \{\text{Both properties are all-components and a conjunction of all-components} \\
  & \;\;\text{properties is an all-components property, for which } [\mathit{WE}^{*}.X \equiv X]\} \\
  & (\text{all balls are blue}) \wedge (\text{at most one ball}) .
\end{array}
$$

This is consistent with our intuition: If a system contains only one ball and the
ball is blue, then all its components must contain only blue balls and at most
one ball. Note that the condition we obtain is not guaranteed to hold in a system
even if it holds in all its components because the property “at most one ball” is
not universal.
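This second calculation can also be cross-checked on small bags. The sketch below assumes the dual reading of Proposition 3, namely that WE*.X holds in F when some system containing F satisfies X, and approximates it by testing extensions of at most two balls. All names and the three-color palette are hypothetical.

```python
from itertools import combinations_with_replacement

COLORS = ("blue", "green", "red")   # hypothetical palette, kept sorted

def compose(f, g):
    return tuple(sorted(f + g))

def bags(max_size):
    return [b for n in range(max_size + 1)
            for b in combinations_with_replacement(COLORS, n)]

def we_star(x, max_extra=2):
    """Bounded reading of WE*: X holds in F o K for some tested extension K."""
    extras = bags(max_extra)
    return lambda f: any(x(compose(f, k)) for k in extras)

def one_blue_ball(b):
    """Exactly one ball and the ball is blue."""
    return b == ("blue",)

def claimed(b):
    """(all balls are blue) and (at most one ball)."""
    return all(c == "blue" for c in b) and len(b) <= 1

agrees = all(we_star(one_blue_ball)(f) == claimed(f) for f in bags(3))

# The calculated condition is not universal: two components may each
# satisfy it while their composition holds two balls.
two_blues = compose(("blue",), ("blue",))
not_universal = claimed(("blue",)) and not claimed(two_blues)

print(agrees, not_universal)
```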

6 Conclusions

This paper reports on an ongoing exploration of using ideas from the mathema- 
tics of program construction to develop a theory of compositional design. This 
paper shows how we exploit concepts from the axiomatic semantics of programs 
for designing systems. The specific constructs that we investigated in this paper 
are predicate transformers and their conjugates. The only assumptions we made 
about the composition operator were that it was associative and that it had 
a unit element. A sizable body of theory about transformers can be developed 
from only these limited assumptions. 

We started this study because of our conviction of the importance of composi- 
tion in systems design. We believe that systems, and especially software systems, 
should more and more be constructed from generic, “off the shelf”, components. 
This means that reuse of systems and components is going to be a central issue. 

Reuse of existing systems and description of generic components require a 
specification language that is abstract enough to be able to specify only the 
relevant aspects of a system and to hide operational details as much as possible. 
For this reason, we depart from the process calculus approach, such as in CSP or 
CCS, and focus on logical specifications. Especially, we are interested in applying 
our approach to concurrent systems specified with temporal logics. 

Of course, more abstract specifications lead to more difficulties in terms of 
composition. However, a great amount of work has been done in relation with the 
composition of concurrent systems described in terms of temporal logic specifi- 
cations. Among the logics that were considered, we can cite the linear temporal 
logic [?] and some variants [?], TLA [?] or Unity [?]. It should be
noted that all these works, and even more general studies done at a semantic [?]
or syntactic level, rely on the same fundamental hypothesis that systems
are described in terms of states and computations. More precisely, the key point 
that allows systems to be composed is always the same: Components are speci- 
fied in terms of open computations, i.e., infinite traces that are shared with the 
environment. 

In this paper, we adopt a dual perspective on the question of composition. 
We do not want to consider systems that can always be composed (i.e., that are 
specified in such a way that composition is going to work) and we do not rely 
on the open computations assumption. Instead, we consider any kind of systems 
and look at what happens when they are composed. In this way, we hope to be 
able to reason on composition in the abstract and understand its fundamental 
issues. 

So far, this proposed framework helped us better understand the guarantees
operator, which was originally defined in [?], along with existential and
universal properties. It is interesting to note that guarantees, when applied to
temporal logic and reactive systems, gives us a powerful way to combine logical
specifications. In particular, the compositional characteristic (existential) of
X guarantees Y does not depend on properties X and Y. For instance, both the
left-hand side and the right-hand side can be progress properties, which is not
possible with usual assumption-commitment specifications. This leads to simpler
proofs of composition, as in [?].

Also, the framework emphasizes a symmetry between top-down design and 
bottom-up design. Conjugates of predicate transformers allow us to deduce theo- 
rems about top-down decomposition from corresponding theorems of bottom- 
up composition. We believe that, from a practical point of view, the top-down 
problem (identify suitable components) is as important as the more classical 
bottom-up problem (deduce system properties). 

In this paper we limited our discussion to properties that were existential 
or universal or conjunctions of such properties. We have also shown elsewhere
[?] that nice proofs of correctness for significant concurrent programs can be
developed using these concepts coupled with a logic such as Unity. In this pa- 
per we were not concerned with showing how these concepts could be used for 
programming; instead, we were primarily concerned with showing how concepts 
from programming can be applied to broad classes of systems in which composi- 
tion has simple properties such as associativity and existence of a unit element. 

Much further work needs to be done to develop an axiomatic semantics of a 
theory of compositional design. We have only begun to explore the area. Theo- 
rems that derive from further assumptions about the compositional operator 
must be developed. Properties that are not necessarily predicates on systems 
should be studied. In general, a property is a function from systems to some 
type that is not limited to booleans. A theory should be able to reason about 
properties such as mass and energy consumed. We believe that it is possible to 
construct axiomatic theories that help in understanding the basic principles of 
compositional design. 

Abstract. This is a general introduction to the panel New Challenges
for Theoretical Computer Science with panelists G. Ausiello, J. Gruska, 
U. Montanari, T. Nishizeki, Y. Toyama and J. Wiedermann. The aim of 
the panel is to point out and analyse new challenges theoretical compu- 
ter science has to deal with in order to develop properly as fundamental 
science and in order to respond well to needs coming from the outside 
world, especially to those informatization of society and global networ- 
king bring. 



Vision Theoretical computer science is still being developed too much
according to the same principles as pure mathematics: following mainly the
inherent needs to develop more powerful concepts and proof methods, to
generalize results, to simplify proofs, and to solve important, interesting
or long-standing problems. Most pure mathematicians hardly look to the
outside world for needs and motivations.

Theoretical computer science is believed here to need to follow some of the
same research paradigms as mathematics, but ultimately to have much broader
goals: to study the fundamental laws and limitations of the information
processing world, and also to contribute to the development of the new,
third, main methodology^ of mankind in general and of science in particular.
Moreover, theoretical computer science has a need and an obligation to
contribute to solving the big, difficult new problems that informatization
and global communication bring. It therefore needs to look out of its
theoretical vacuum more often and more carefully: to see whether it is
concentrating correctly and sufficiently on its main global aims, to see
where the problems are that it can and should help to solve, and to see how
powerful the tools it develops really are.

* Support of the GACR grant 201/98/0369 and of the grant CEZ:J07/98:143300001
is to be acknowledged.

^ This refers to the understanding that science has so far had two main
methodologies: the theoretical one (largely represented and supported by
mathematics) and the experimental one. Computer science is believed to bring
a third methodology that at the same time (substantially) reinforces the old
methodologies and helps to bridge them.

Challenges At the turn of the millennium, theoretical computer science is
faced with several new challenges that are scientifically fundamental,
practically very important, and hard.

— Due to several developments one can witness a liberalization of theoretical
computer science from serving mainly computation and communication
technology. Theoretical computer science is becoming more and more clearly
a fundamental science that has to study the laws and limitations of the
information processing world in all its forms, and that has to concentrate
on developing the corresponding foundations, paradigms, concepts, models
and tools.

— An understanding of the limitations of current computing and communication
technologies and methods has turned attention to a more systematic study
of the ways Nature can perform information processing. The goals are
twofold: to develop new, more Nature-based information processing tools,
and to get a better understanding of Nature. Quantum computing, molecular
computing and brain computing are among the areas that keep bringing new
challenges that are fascinating in both directions.

— Radically new questions concerning computing and its power are starting to
be seriously asked, and radically new theories and computational models
need to be, and are starting to be, on the agenda. They range from theories
and models connected with new applications and new computational modes to
theories and computational models that try to understand the brain, to be
more powerful than “Turing models”, to capture the essence of learning and
understanding, and even to capture the essence of the evolution and genesis
of the Universe. And not only that: as in physics, the question “Is there
a (computational) model of everything?” is being raised.

— Since society is increasingly dependent, see [2], on very large software
systems and on global communications, the reliability of very big software
systems and the security of communications are becoming further very hard
and very important problems that theoretical computer science is asked and
obliged to help with, by developing the corresponding insights, concepts,
theories and methods.

— The existence of global networks, and their size and complexity, require
the development of new theoretical tools to handle the new issues they
create. For example, one task is to develop an abstract semantic framework
for dealing with such problems of highly distributed networks as those
created by the need to manage cooperation, control and synchronization, as
well as verification of systems with high mobility of code (written in
different programming languages) and computations.

— Various limitations concerning information processing have been discovered.
Some of them are very unpleasant. For example, an enormous number of
practically important problems are NP-complete. An important task is to
find ways to compute, in some reasonable sense, what cannot be computed
in some very strict sense and in the worst case.

— A variety of new challenges for theoretical computer science also come with
the fast growing need to solve (efficiently) problems with enormously large
data. New dimensions to this problem arise because these data may be
produced in a very distributed way and may keep changing dynamically. In
addition, the problems need to be solved in an on-line way. This in turn
adds a new dimension to the problems of efficient distributed computing
and of efficient use of hierarchies of (secondary) memories. Searching for
textual, visual and sound data in wide area networks with millions of nodes
and all-to-all connections, as well as broadcasting such data in global
networks, are examples of such problems.

Some of these challenges are discussed in more detail in the position
statements of the panelists.



Teaching of theoretical computer science seems to be in a crisis. Students
keep challenging the relevance of some subjects and the ways theory is
presented. Theoretical computer science seems to need to learn from other
mature sciences, such as physics, chemistry and biology, to present its
concepts, methods and results in a much more synthesizing form and to
concentrate mainly on outcomes of clear importance (see [1] for an attempt
to do that). The new challenges theoretical computer science has to deal
with also require that its fundamental and powerful concepts, methods and
results are presented to students in a much more mature and impressive way.



Epilogue Theoretical computer science research should get out of its
“concentration on lg* n improvements”, and its education should be such that
students do not remember it mainly as “pumping-lemmas stuff”. The big new
challenges of theoretical computer science and its enormous potential
require that.

Considering the tremendous growth of computer science in the last
twenty-five years, since the first personal computer made its appearance,
and the even stronger impetus given to this growth by the creation of the
World Wide Web, I do not think that anybody would dare to say what computing
will be like during the 21st century. We are by now used to quite unexpected
technological leaps that change the picture in our hands while we are still
trying to analyze it, and we do not know what else is waiting for us around
the corner. It may be, although I am quite skeptical, that some technological
breakthrough will make molecular computers or quantum computers available
earlier than we might expect, and that this might dramatically change our
view of computing.

On the other hand, on more conservative ground, if we look at a closer time
scale, the next ten years for example, some driving trends of computing
appear rather clearly. We may then be able to foresee the evolution of
computing on the basis of the current directions of technology and
applications and, accordingly, to pinpoint some of the contributions that
theoretical research can make to this evolution.

First of all, the most impressive push to the technological evolution of
computer science derives from the “all connected to all” paradigm and the
use of IP as the basic support for this paradigm. This enormously increases
the demand for guaranteed quality of service on the internet in terms of
voice and image communication, broadcasting, searching and web caching,
correctness of mobile code migrations, security etc. Among other problems,
this development implies the study and design of new efficient and powerful
techniques for search engines, as witnessed by the Search Engine Watch and
by the fact that important research institutions have started projects in
this area, such as the Clever Project (IBM, San Jose) and the Google Project
(Stanford University).

An interesting aspect of the scientific development in this field is related
to the fact that, in order to design web search engines, we need to represent
and manipulate large portions of the web, a graph with millions of nodes and
billions of arcs. This leads us to the second characteristic aspect of the
current evolution of Information Technology: the need to manipulate very
large input data. While software applications for the final user become
(apparently at least) simpler and simpler, the applications involved in the
management of ITC systems become more and more complex and resource
demanding. For example, to define their billing strategies,
telecommunication companies want to know the communication graph of all
calls among their customers. Scientific computer applications also have to
deal with huge data sets (very long strings in biology, very large matrices
in theoretical physics, very large instances for finite element methods in
fluid dynamics etc.). In all these cases classical asymptotic analysis is
not adequate for describing and interpreting the behaviour of programs. We
have to rethink our computation models to allow the analysis and design of
algorithms on secondary storage and to predict the impact of locality and
caching strategies on different algorithms. In this area too, research
efforts are already being carried out in various universities worldwide,
among others Duke University in the US and the universities participating
in the European Union research project ALCOM-FT.
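
To make the point about locality concrete, here is a minimal sketch (not taken from the statement itself) that counts block transfers between a small fast memory and a large slow one. It assumes an LRU replacement policy and hypothetical parameters N, B and M (matrix side, words per block, blocks of fast memory). Both traversals below perform the same number of elementary steps, so classical asymptotic analysis cannot tell them apart, yet their transfer counts differ by a factor of B.

```python
from collections import OrderedDict

def count_block_transfers(addresses, block_size, cache_blocks):
    """Count block transfers (misses) for an address trace under LRU."""
    cache = OrderedDict()
    misses = 0
    for addr in addresses:
        block = addr // block_size
        if block in cache:
            cache.move_to_end(block)       # mark block as most recently used
        else:
            misses += 1                    # block must be fetched from slow memory
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return misses

def row_major(n):
    """Addresses of an n x n matrix (stored row by row), visited row by row."""
    return (i * n + j for i in range(n) for j in range(n))

def column_major(n):
    """The same matrix, visited column by column."""
    return (i * n + j for j in range(n) for i in range(n))

# Hypothetical parameters: matrix side, words per block, blocks in fast memory.
N, B, M = 64, 16, 8
row_misses = count_block_transfers(row_major(N), B, M)
col_misses = count_block_transfers(column_major(N), B, M)
print(row_misses, col_misses)
```

With these parameters the row-by-row scan transfers each block once, while the column-by-column scan misses on every access, because each column touches more distinct blocks than the fast memory can hold.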

A third direction of computer evolution that can be clearly identified, and
that requires new answers from theoreticians, is related to the increasing
role of dynamic and on-line applications. In this type of application (think
of routing problems, scheduling, resource allocation, investment strategies
etc.) data are not static but evolve in a somewhat unpredictable way, and we
have to face this evolution by maintaining our data in such a way that we do
not have to recompute the solution of a problem from scratch at every change
of the input data, but can just rearrange it at a limited cost. At the same
time we want to define strategies for problem solution that, while not
optimal, are nevertheless “competitive”, that is, do not lose too much with
respect to an off-line algorithm that is aware, ahead of time, of the whole
sequence of changes in the input data. The efficient and competitive
solution of dynamic and on-line problems has been studied over the last
fifteen years, but practical applications of theoretical results are still
limited by the fact that the theoretical models are inadequate, and much
more research effort should be spent in this domain.
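
The notion of competitiveness can be illustrated with the textbook ski-rental problem, which is not discussed in the statement and is used here only as a stand-in: renting costs 1 per day, buying costs a hypothetical price, and the number of days is revealed only as time passes. The sketch below compares the simple break-even rule with the off-line optimum that knows the number of days in advance.

```python
def offline_cost(days, buy_price):
    """Optimal cost when the number of days is known ahead of time."""
    return min(days, buy_price)          # rent every day, or buy at once

def break_even_cost(days, buy_price):
    """On-line rule: rent for buy_price - 1 days, then buy on the next day."""
    if days < buy_price:
        return days                      # the season ended before we bought
    return (buy_price - 1) + buy_price   # rent paid so far plus the purchase

def worst_ratio(buy_price, horizon):
    """Largest on-line/off-line cost ratio over all season lengths up to horizon."""
    return max(break_even_cost(d, buy_price) / offline_cost(d, buy_price)
               for d in range(1, horizon + 1))

print(worst_ratio(buy_price=10, horizon=100))
```

For a purchase price of 10 the ratio never exceeds 19/10, so the on-line rule loses less than a factor of 2 against an adversary that knows the whole input sequence.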

Maturing of informatics Recent developments in the area of quantum
information processing and communication (QIPC) have made clear that the
foundations of computing have to be based not on the laws and limitations of
classical physics, as has been the case so far, but on the laws and
limitations of quantum physics.

This discovery has several profound impacts on theoretical informatics and
on its goals and methods, and it also brings a new series of very basic
challenges for theoretical informatics. It has also significantly increased
the liberalization of theoretical informatics from being mainly a servant of
computing and communication technology and applications, and has put it
into the position of a fundamental science that also has a strong impact on
the creation and development of new computing and communication
technologies. This liberalization is expected to have far-reaching
consequences for how we should and will see the foundations, aims, paradigms
and methods of theoretical informatics.

Another implication of the quantumization of (theoretical) informatics is
that it has helped to show more clearly the basic scientific goal of
theoretical informatics: to discover and study the laws and limitations of
the information processing world (whatever that means). This can also be
viewed as a parallel to the main goal of physics: to discover and study the
laws and limitations of the physical world (whatever that means).



Going quantum Theoretical informatics has already started to get involved in
the study of the problems of QIPC. Moreover, its contributions, especially
concerning the power of quantum algorithms and communication protocols, the
security of cryptographic protocols, quantum error-correcting codes and
fault-tolerant computations, as well as the existence of efficient universal
quantum Turing machines, have been crucial for the development of the field.
However, one also has to see that theoretical informatics has so far
concentrated mainly on problems that can be seen as natural quantum
variations of classical problems, for example on the study of models of
quantum automata that can be seen as direct quantumizations of the classical
models. In other words, theoretical informatics is going quantum, but it is
leaving its old views of the information processing world and its
principles, laws and limitations only very slowly and cautiously.

* Support of the GACR grant 201/98/0369 and of the grant CEZ:J07/98:143300001 
is to be acknowledged. 

The aim of this position statement is to point out that problems of quantum
information processing of a different type, those that are inherently
quantum, are theoretically equally interesting, and perhaps even deeper and
more important, and to try to turn the attention of the (theoretical)
informatics community to such problems for three reasons: (a) they are of
great (key) importance for the development of the field; (b) the research
paradigms and methods of (theoretical) informatics seem to be well suited to
deal with these problems; (c) they are intellectually exciting and rewarding
as well as fundamentally deep.

A related underlying thesis is that the paradigms, concepts, methods and
results of theoretical informatics are so fundamental that they should be
expected to have a very broad range of applications, as those of quantum
mechanics do.

The main new features that theoretical informatics has to deal with when
going quantum can be summarized as follows.

— New information processing and communication resources (quantum super- 
position, parallelism, channels, entanglement and measurement). 

— New information processing primitives (qubits, unitary operators, projection
measurements, density matrices, superoperators and POV measurements for
computation, and bits, qubits and ebits for communication).

— New concepts of feasibility and new computational complexity classes. 

— New types of (quantum) gates, circuits, automata, algorithms and protocols. 

— New concepts of (quantum) information and new (quantum) information 
theory. 

— New concepts of security (based on the laws of physics). 

— New goals: (a) those oriented to quantum physics (for example, a deeper
understanding of such quantum phenomena as measurement, non-locality and
decoherence); (b) those oriented to theoretical informatics (for example,
the development of complexity theory on the basis of the laws and
limitations of quantum information processing); (c) those oriented to
quantum information processing (for example, an understanding of the
computational and communication power of entanglement, and the development
of quantum information theory and of new computation and communication
paradigms).

Interestingly and importantly enough, it has been to a large extent due to
the insights and results on the computational power of quantum information
processing that an understanding has emerged that the enormous experimental
and developmental effort clearly needed to build powerful quantum
information processing systems could pay off. This was followed by the
observation that research in quantum information processing can also
contribute to a new and deeper understanding of quantum physics, and by that
also of Nature. A contribution to the development of such an understanding
can be seen as one of the major impacts of the concepts, methods and results
of theoretical informatics in general, and of (computational) complexity
theory in particular, on other sciences.

Quantum information processing is an area where very fundamental questions
concerning Nature are being (re)asked and need to be answered. It is an area
where one finds various mysterious and paradoxical phenomena, and an area
where it is very nontrivial to find mathematical equivalents of the basic
concepts and phenomena of Nature, and proper interpretations, in terms of
the elements and phenomena of Nature, of the mathematical concepts being
used.



Quantum resources The power of quantum information processing and
communication is based on an appropriate use of several new quantum
resources.

— Quantum superposition. An $n$-qubit quantum register can be simultaneously
in a superposition of all $2^n$ (standard) basis states
$\{|x\rangle \mid x \in \{0,1\}^n\}$ of $\mathcal{H}_{2^n}$.
In addition, an equally weighted superposition of all the basis states,

$$|\psi\rangle = \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} |i\rangle, \qquad (1)$$

an important initial state of many computations, can be created in a single
step using only $n$ simple quantum gates.

— Quantum parallelism. In one step of a quantum evolution exponentially many
classical computations can be “performed”. For example, for any function
$f : \{0,1,\ldots,2^n-1\} \to \{0,1,\ldots,2^n-1\}$ there is a unitary
mapping $U_f : |x, 0\rangle \to |x, f(x)\rangle$, and if $U_f$ is applied to
the exponentially large superposition (see (1)) of two $n$-qubit registers,
then in one step the state

$$U_f(|\psi, 0\rangle) = \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} |i, f(i)\rangle \qquad (2)$$

is created, and therefore exponentially many values of $f$ can be “computed”
in one step.

— Quantum entanglement. Some quantum states of a bipartite quantum system
cannot be expressed as tensor products of the states of its subsystems,
for example the state $\frac{1}{\sqrt{2}}(|01\rangle - |10\rangle)$. Such
pure states are called entangled, and they are a puzzling phenomenon because
they exhibit non-locality features. Indeed, two particles in an entangled
state may be far from each other; in spite of that, any measurement of one
of the particles “automatically” determines the state of the second particle.
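
The three resources above can be checked on a small classical simulation. The following is only an illustrative sketch with plain state vectors (lists of real amplitudes). The helper names and the sample function f are hypothetical, and the entanglement test uses the standard fact that a two-qubit pure state with amplitudes a00, a01, a10, a11 is a tensor product exactly when a00·a11 − a01·a10 = 0.

```python
import math

def apply_hadamard(state, qubit, n):
    """Apply a Hadamard gate to one qubit of an n-qubit state vector."""
    s = 1 / math.sqrt(2)
    new = [0.0] * len(state)
    mask = 1 << (n - 1 - qubit)            # bit of this qubit in the index
    for idx, amp in enumerate(state):
        if amp == 0.0:
            continue
        sign = -1 if idx & mask else 1     # H|1> has a minus sign on |1>
        new[idx & ~mask] += s * amp        # contribution to the |0> branch
        new[idx | mask] += s * amp * sign  # contribution to the |1> branch
    return new

def uniform_superposition(n):
    """State (1): built from |0...0> with n single-qubit Hadamard gates."""
    state = [0.0] * (2 ** n)
    state[0] = 1.0
    for q in range(n):
        state = apply_hadamard(state, q, n)
    return state

def apply_u_f(first_register, f, n):
    """State (2): U_f applied to |first_register>|0>, index = x * 2**n + y."""
    size = 2 ** n
    out = [0.0] * (size * size)
    for x, amp in enumerate(first_register):
        out[x * size + f(x)] = amp         # |x, 0> -> |x, f(x)>
    return out

def is_entangled(a00, a01, a10, a11, tol=1e-12):
    """True iff the two-qubit pure state is not a tensor product."""
    return abs(a00 * a11 - a01 * a10) > tol

n = 3
f = lambda x: (3 * x + 1) % 8              # hypothetical sample function
psi = uniform_superposition(n)
phi = apply_u_f(psi, f, n)
r = 1 / math.sqrt(2)
print(is_entangled(0.0, r, -r, 0.0), is_entangled(0.0, 1.0, 0.0, 0.0))
```

All 2^n values f(0), ..., f(2^n − 1) appear in the single state phi, which is the sense in which they are “computed” in one step. A measurement would still reveal only one of them.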



Quantum limitations Quantum physics also imposes several severe limitations
on quantum information processing.

— Heisenberg’s uncertainty principle says that measuring the value of one
observable more accurately makes the value of another, noncommuting,
observable less certain. In addition, there is a certain intrinsic
uncertainty with which values of two observables can be measured, and once
a way of measurement is fixed, this uncertainty, in general, increases.

— No-cloning theorem. Very important for quantum information processing
is the result that an unknown quantum state cannot be cloned. (Namely,
there is no unitary transformation $U$ such that
$U(|\psi, 0\rangle) = |\psi, \psi\rangle$ for every one-qubit state
$|\psi\rangle$.) The no-cloning theorem seems to be bad news. For example,
it was to a large extent due to this theorem that quantum error-correcting
codes seemed to be impossible. However, the no-cloning theorem is also good
news: because quantum information cannot be copied safely, we have quantum
key generation systems that are provably unconditionally secure.

— Holevo theorem. It seems that a state of a quantum register can contain
an unlimited amount of information, because amplitudes can be numbers with
infinite decimal expansions. However, because of the peculiarities of
quantum measurement, the amount of information we can get out of a quantum
state into the classical world is very limited. The Holevo theorem says that
in an n-qubit register state we can encode, and later safely decode, only
n bits.

— Decoherence. This is a term for the processes by which a quantum system
(processor) couples with its environment if it is not perfectly isolated.
As a consequence, the quantum states of the system are modified by the
interactions of the system with the environment. Such interactions mean
that the quantum dynamics of the environment is also relevant to the
operations of the quantum computer, and that its states become entangled
with the states of the environment. Decoherence tends to destroy
irreversibly the information in a superposition of states in a quantum
computer, in a way we cannot control, and therefore long computations, as
well as long-term memories, seem to be impossible.
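
The no-cloning statement above follows in a few lines from the fact that unitary transformations preserve inner products. The following is the standard textbook argument, given here as a sketch; the second state $|\varphi\rangle$ is introduced only for the derivation.

```latex
% Assume a unitary U clones two one-qubit states |\psi> and |\varphi>:
%   U|\psi,0> = |\psi,\psi>   and   U|\varphi,0> = |\varphi,\varphi>.
\begin{align*}
\langle \varphi | \psi \rangle
  &= \langle \varphi, 0 | \psi, 0 \rangle
     && \text{since } \langle 0 | 0 \rangle = 1 \\
  &= \langle \varphi, 0 | U^{\dagger} U | \psi, 0 \rangle
     && \text{since } U^{\dagger} U = I \\
  &= \langle \varphi, \varphi | \psi, \psi \rangle
   = \langle \varphi | \psi \rangle^{2}.
\end{align*}
% Hence <\varphi|\psi> is 0 or 1: the same U can clone two states only if
% they are orthogonal or identical, so no U clones every state |\psi>.
```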



Successes of quantum information processing Perhaps the main killer
applications of quantum information processing so far are Shor’s quantum
polynomial-time algorithms for factoring integers and computing discrete
logarithms, and Grover’s result that one can search an unordered database of
n elements using only $O(\sqrt{n})$ queries, and not $O(n)$ queries as in
the classical case. The main open problem is whether there are quantum
polynomial-time algorithms for the so-called hidden subgroup problem for
non-Abelian groups.
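
The query count in Grover’s result can be observed on a classical state-vector simulation. The sketch below is illustrative only: it tracks all n amplitudes explicitly (so it is not itself efficient), and it assumes a single marked element and the usual choice of about (π/4)·√n iterations, with one oracle query per iteration.

```python
import math

def grover_search(n_items, marked):
    """Simulate Grover iterations; return (oracle queries, success probability)."""
    amp = [1 / math.sqrt(n_items)] * n_items        # uniform superposition
    iterations = int(math.pi / 4 * math.sqrt(n_items))
    for _ in range(iterations):
        amp[marked] = -amp[marked]                  # oracle: flip the marked sign
        mean = sum(amp) / n_items
        amp = [2 * mean - a for a in amp]           # inversion about the mean
    return iterations, amp[marked] ** 2

queries, success = grover_search(1024, marked=123)
print(queries, success)
```

For 1024 elements this uses 25 oracle queries, compared with about 512 expected classical probes, and measuring the final state yields the marked element with probability above 0.99.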

Quantum teleportation, due to Bennett and Brassard, experimentally veri- 
fied, has been shown to be the most novel and useful tool for quantum commu- 
nication. 

Quantum cryptography, initiated by Wiesner, showed that unconditionally
secure quantum key generation is provably possible on the basis of the laws
and limitations of quantum physics. This brings a new dimension to the
problem of secure communication.

Thanks to ideas coming from theory, several ways have been found to fight
quantum decoherence, the main enemy of potential quantum computers, for
example Shor’s quantum error-correcting codes and quantum fault-tolerant
computations.

Main quantum challenges can now be seen as follows.

— To understand quantum measurement and to get insights into how many
resources it needs.

— To study quantum entanglement, and its different types, as a new and power- 
ful computation and communication resource - to study ways entanglement 
is detected, measured, created, stored, transmitted (even teleported), mani- 
pulated, transformed (from one form to another), quantified and consumed 
(to do some useful work). 

— To study multipartite entanglement and its use for distributed quantum
information processing and for physics in general, and to discover the laws
and limitations of how entanglement can be shared.

— To study quantum (pure) states as a precious resource. (It is of great
importance to find out how much quantumness we really need, and how pure it
has to be, in order to have systems that are still more powerful than
classical ones.)

— To develop quantum information theory. The importance of quantum
information theory follows from the observation that if we want to be able
to understand and fully utilize the information processing potential
available in Nature, then the concepts of classical information theory need
to be expanded to accommodate quantum information carriers.

— To discover new techniques (in addition to those of Simon/Shor and Grover) 
to design quantum algorithms more powerful than classical ones - if they 
exist. 

— To deal with quantum mysteries. (The existence of various mysterious
quantum phenomena, often termed paradoxes, has accompanied the development
of quantum mechanics from its beginnings.) Quantum mysteries did not use to
bother “working physicists” too much, and therefore there has been a
tendency to see these mysteries as topics that mainly philosophers of
science should deal with. However, now that research in quantum information
processing and communication aims to discover the laws and limitations of
both the physical and the informational world, the existence of these
mysteries cannot be ignored any longer. Moreover, theoretical informatics
seems to have the potential to help to deal with these mysteries.

Conclusion The potential of QIPC is starting to be widely recognized. A
number of national and international programs and initiatives have been set
up for QIPC. For the basics and main outcomes of QIPC see Gruska (1999). For
the main challenges of QIPC, especially for theoretical informatics, see
Gruska (1999a).

References 

1. Jozef Gruska. Quantum challenges. In Proceedings of SOFSEM’99, LNCS 1375, 
pages 1-28, 1999. 

2. Jozef Gruska. Quantum computing. McGraw-Hill, 1999. See also additions and
updates to the book at http://www.mcgraw-hill.co.uk/gruska.




Two Problems in Wide Area Network 
Programming* 

(Position Statement) 

Ugo Montanari 

Dipartimento di Informatica, Universita di Pisa, Italy 
ugo@di.unipi.it



Motivations Highly distributed networks have now become a common platform
for large scale distributed programming. Internet applications distinguish
themselves from traditional applications in scalability (huge numbers of
users and nodes), connectivity (both availability and bandwidth),
heterogeneity (operating systems and application software) and autonomy (of
administration domains having strong control of their resources). Hence, new
programming paradigms (thin client and application servers, collaborative
“peer-to-peer”, code-on-demand, mobile agents) seem more appropriate for
applications over the internet.

These emerging programming paradigms require on the one hand mecha- 
nisms to support mobility of code and computations, and effective infrastructu- 
res to support coordination and control of dynamically loaded software modules. 
On the other hand, an abstract semantic framework to formalize the model of 
computation of internet applications is clearly needed. Such a semantic frame- 
work may provide the formal basis to discuss and motivate controversial de- 
sign/implementation issues and to state and certify properties in a rigorous way. 

Concern for the limited understanding we have of network infrastructure and 
its applications has been explicitly expressed in the US PITAC documents |0|. 
Also a theme on mobile/distributed reactive systems has been suggested as a 
new, proactive initiative for long-term, innovative research in a recent meeting 
promoted by the European Commission within the FET part of the V Framework 
programme. 

Here we point out and shortly outline two issues which arise in the definition 
of such an abstract semantic framework. 

Synchronization in a Coordination Framework Coordination is a key concept
for modeling and designing heterogeneous, distributed, open ended systems.
It applies typically to systems consisting of a large number of software
components, considered as black boxes, which are independently programmed
in different languages and may change their configuration during execution.

While most of the activity is asynchronous, some applications, typically com- 
puter supported collaborative work or transactions among multiple partners, 
need primitives for synchronization. Synchronization might consist of complex 
computations to be performed by all partners on shared data before the global 

* Research supported by TMR Network GETGRATS and by MURST project TOSCa. 
commit action takes place. While these primitives will be implemented in terms
of asynchronous protocols in the lower levels of system software, it is important 
to offer to the user a high level model of them, to be employed for specification, 
validation and verification. 

The concepts developed for concurrent access to data bases, like
serializability or nested transactions, are not fully adequate in this
setting, since they refer to restricted models of computation. Instead,
issues such as name and process mobility, causal dependencies between
interactions, and refinement steps in the design process should be
considered. We see this as a challenging problem, with aspects of logic,
semantics, concurrency theory, programming languages, software
architectures, constraint solving and distributed algorithms.

As a contribution we just mention two pieces of work by the author and
collaborators. The first is about a generalized version of Petri nets: a
zero-safe net is in a stable state when certain places are empty. Step
sequences from stable states to stable states through non-stable states are
transactions, and correspond to transitions of an abstract net. The second
piece of work is a model called tile logic (see
http://www.di.unipi.it/~ugo/tiles.html), which extends the structured
operational semantics (SOS) of Gordon Plotkin and the rewriting logic of
Jose Meseguer. Tiles are inference rules which can be combined horizontally
to build transactions and vertically to build ordinary computations.
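
The reading of transactions as step sequences between stable states can be sketched on a toy net. This is a simplified illustration and not the authors' formal definition: the place names, the two transitions and the bound on sequence length are all hypothetical, and one transition fires at a time.

```python
from collections import Counter

ZERO_PLACES = {"z"}   # hypothetical zero place: must be empty in stable states

# Hypothetical transitions: name -> (tokens consumed, tokens produced).
TRANSITIONS = {
    "open":  (Counter({"a": 1}), Counter({"z": 1})),
    "close": (Counter({"z": 1, "b": 1}), Counter({"c": 1})),
}

def is_stable(marking):
    """A marking is stable when every zero place is empty."""
    return all(marking[p] == 0 for p in ZERO_PLACES)

def enabled(marking, pre):
    return all(marking[p] >= k for p, k in pre.items())

def fire(marking, pre, post):
    nxt = marking.copy()
    nxt.subtract(pre)
    nxt.update(post)
    return +nxt                       # unary plus drops places with zero tokens

def transactions(start, max_len=4):
    """Firing sequences from a stable marking to the next stable marking."""
    assert is_stable(start)
    found = []

    def explore(marking, seq):
        for name, (pre, post) in TRANSITIONS.items():
            if not enabled(marking, pre):
                continue
            nxt = fire(marking, pre, post)
            if is_stable(nxt):
                found.append((seq + [name], nxt))   # transaction completed
            elif len(seq) + 1 < max_len:
                explore(nxt, seq + [name])          # still inside a transaction
    explore(start, [])
    return found

result = transactions(Counter({"a": 1, "b": 1}))
print(result)
```

In this toy net the only transaction is the sequence open, close. At the abstract level it behaves like a single transition that consumes one token from a and one from b and produces one token in c, and the intermediate marking with a token in z is not observable.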

Finite State Verification with Name Creation Finite state verification is 
possible when threads of control are independent of the actual data. In this case 
an automaton encompassing the whole state space can be constructed, possibly 
minimized, and checked for properties of interest expressed in some modal or 
temporal logic. This technique has been successfully and extensively applied to 
hardware and protocols. 

When the computation involves the creation of new names which can occur 
in transition labels, the ordinary finite state techniques cannot be applied, since 
dynamic allocation of names with reuse of the old ones is required if the states 
must remain finite. On the other hand, in the coordination approach control 
is actually often independent from data, since computation mostly consists of 
redirecting streams, connecting and disconnecting users, transferring them from
one ambient to another, resolving conflicts and performing security checks.
Creation of new names is actually quite common in wide area programming, since
nonces generated during secure sessions, process locations, and causes of forthco- 
ming events can also be represented as names. Finite representation techniques 
for such systems together with expressive logical frameworks and efficient algo- 
rithms are needed for checking security and other global and local properties of 
distributed applications. 

Some experience has been described in the literature with π-calculus
verification. Also, Marco Pistore and the author have introduced certain
classes of automata, called History Dependent (HD) automata [5], which are
able to allocate and garbage collect names. Behavioral properties related to
dynamic network connectivity, locality of resources and processes, and
causality among events can be formally verified on finite HD-automata.
However, efficiency is not satisfactory and a lot of work remains to be done
on the theoretical and practical applicability of these methods.
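
Why reuse of names keeps the state space finite can be seen on a toy system. The sketch below does not implement HD-automata. It only illustrates the underlying idea under stated assumptions: a state is a tuple of at most two live names, a step either creates a globally fresh name or forgets the oldest one, and states that differ only in the identity of their names are identified by renaming names canonically.

```python
from collections import deque

def successors(state, fresh):
    """Toy system: a state is a tuple of live names (at most two)."""
    succ = []
    if len(state) < 2:
        succ.append(state + (fresh,))   # create a brand-new name
    if state:
        succ.append(state[1:])          # forget (garbage collect) the oldest name
    return succ

def normalize(state):
    """Rename live names to 0, 1, ... in order of first occurrence."""
    renaming = {}
    for name in state:
        renaming.setdefault(name, len(renaming))
    return tuple(renaming[name] for name in state)

def explore(depth, canonical):
    """Count the states reachable within `depth` steps from the empty state."""
    start = ()
    seen = {start}
    frontier = deque([(start, 0)])
    fresh_id = 0
    while frontier:
        state, d = frontier.popleft()
        if d == depth:
            continue
        fresh_id += 1                   # a name never used before
        for nxt in successors(state, ("fresh", fresh_id)):
            if canonical:
                nxt = normalize(nxt)    # reuse old names instead of new ones
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return len(seen)

print(explore(8, canonical=False), explore(8, canonical=True))
```

Without renaming, the number of distinct states keeps growing with the exploration depth. With canonical renaming only three states remain, whatever the depth, so ordinary finite state techniques become applicable.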

Theoretical computer science has offered various computation models, like
automata, Turing machines and lambda calculus, and their importance to
theoretical work and to practice is widely recognized. The progress of the
classical topics of formal language theory, computational complexity, and
mathematical semantics of programming languages is due to these computation
models. In this talk we propose two challenges that have a tight connection
to the mathematical theory of programs and probably need new computation
models to attack them.



Computation Model for Algebraic Programming 

Setting up algebraic equations is a very common approach to many elementary
mathematical problems, and we can easily find solutions satisfying these
equations by a very simple algorithmic method, without deep knowledge of
mathematics. The final goal of algebraic programming is to offer an analogous
framework for deriving programs from algebraic specifications of problems.
Lambda calculus gives a useful mathematical basis for the study of lambda
equations, but its theory is too mathematical and complicated for practical
use. The recent development of program transformation techniques based on
fusion and pairing shows an interesting direction for the automatic algebraic
derivation of functional programs, though their application is still too
limited. The next step for algebraic programming seems to be to discover
new computation models that unify both directions.
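
As an illustration of the two transformation techniques named above, the
following minimal Python sketch shows fusion (two list traversals merged into
one) and pairing (two results computed together in a single traversal). All
function names are hypothetical and chosen for this illustration only; the
sketch uses Python rather than a functional language and is not taken from the
talk.

```python
from functools import reduce

def compose(f, g):
    """Function composition: compose(f, g)(x) == f(g(x))."""
    return lambda x: f(g(x))

def map_twice(f, g, xs):
    # Unfused version: builds an intermediate list for the inner map.
    return [f(y) for y in [g(x) for x in xs]]

def map_fused(f, g, xs):
    # Fusion law: map f . map g = map (f . g); one traversal, no intermediate list.
    h = compose(f, g)
    return [h(x) for x in xs]

def average_two_passes(xs):
    # Two separate traversals: one for the sum, one for the length.
    return sum(xs) / len(xs)

def average_paired(xs):
    # Pairing (tupling): carry the pair (sum, length) through a single traversal.
    total, count = reduce(lambda acc, x: (acc[0] + x, acc[1] + 1), xs, (0, 0))
    return total / count
```

Both rewritten functions return the same results as their originals; only the
number of traversals changes, which is the kind of equational reasoning that
an algebraic derivation would automate.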



Computation Model for Flexible Semantics 

The most common mathematical presentations of the operational semantics of
programming languages are reduction systems. A reduction strategy chooses one
out of many possible reductions, and can give an efficient deterministic
interpreter by narrowing a given idealized nondeterministic semantics. The
notion of reduction approximation is the inverse of that of reduction strategy:
it broadens a given semantics into a more flexible one. A decidable
approximation of term rewriting systems, such as the strongly sequential
approximation, helps us to find a class of term rewriting systems having a
decidable needed reduction strategy. Thus, not only narrowing but also
broadening a semantics is important when designing an efficient deterministic
interpreter for a given operational semantics. This observation easily leads us
to the attractive idea of programming systems with flexible semantics, which
can narrow or broaden the semantics of a programming language if necessary.
Computation models for flexible semantics will provide new useful formal
approaches to software science.
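
To make the role of a reduction strategy concrete, here is a small Python
sketch of a toy rewriting system with two rules, fst(a, b) -> a and
loop -> loop. The term encoding, the rules, and all function names are
assumptions made for this illustration. A strategy is simply a function that
picks one redex position among all available ones, which narrows the
nondeterministic semantics to a deterministic interpreter.

```python
from typing import List, Optional, Tuple, Union

Term = Union[int, tuple]        # an integer or ("symbol", subterm, ...)
Position = Tuple[int, ...]      # path of argument indices from the root

def root_step(t: Term) -> Optional[Term]:
    """Contract t at the root if a rule applies, otherwise return None."""
    if isinstance(t, tuple):
        if t[0] == "fst" and len(t) == 3:
            return t[1]                     # fst(a, b) -> a
        if t == ("loop",):
            return ("loop",)                # loop -> loop
    return None

def redexes(t: Term, pos: Position = ()) -> List[Position]:
    """All redex positions of t in pre-order, so outer redexes come first."""
    found = [pos] if root_step(t) is not None else []
    if isinstance(t, tuple):
        for i, sub in enumerate(t[1:], start=1):
            found += redexes(sub, pos + (i,))
    return found

def contract(t: Term, pos: Position) -> Term:
    """Rewrite the redex of t located at position pos."""
    if not pos:
        return root_step(t)
    i = pos[0]
    return t[:i] + (contract(t[i], pos[1:]),) + t[i + 1:]

def outermost(positions: List[Position]) -> Position:
    return positions[0]                     # first in pre-order

def innermost(positions: List[Position]) -> Position:
    return max(positions, key=len)          # a deepest redex

def normalize(t: Term, strategy, fuel: int = 20) -> Optional[Term]:
    """Reduce t with the given strategy; None if no normal form within fuel steps."""
    for _ in range(fuel):
        positions = redexes(t)
        if not positions:
            return t
        t = contract(t, strategy(positions))
    return None
```

On the term `("fst", 1, ("loop",))` the outermost strategy reaches the normal
form `1` in one step, while the innermost strategy keeps rewriting the `loop`
subterm and exhausts its fuel. The choice of strategy therefore decides
whether the interpreter finds the normal form at all.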




Towards a Computational Theory of Everything 

(Position Statement) 



Jiří Wiedermann*

Institute of Computer Science, Academy of Sciences of the Czech Republic
Pod Vodárenskou věží 2, 182 07 Prague 8
Czech Republic
jiri.wiedermann@cs.cas.cz



Abstract. Is there any common algorithmic principle behind the evolu- 
tion of the universe, of life, of the mind? Should the currently prevailing 
computational paradigms which are limited by the Church-Turing thesis 
be revised in favor of more powerful ones? Can real systems exist whose 
computational power is provably greater than that of Turing machines? 
Is there a computational theory of everything? These are provocative 
questions that should be on the agenda of computer science for the next 
decade. 



The universe gave birth to life, life developed intelligence, and man
constructed computers. Can this sequence be reversed? Can it be closed into an
endless circle? Can a computer be intelligent? Can life emerge and exist within 
computers? Can man create a universe? 

At the beginning of the 20th century it was not even possible to formulate
questions such as those above. Nowadays these questions, and similar ones, are
becoming central topics in artificial intelligence and in fundamental sciences
such as physics, biology, mathematics, psychology, etc. At the intersection of
the respective efforts lies the youngest of all fundamental sciences: computer
science. At the present time, theoretical computer science is about to attack
these problems by making use of its own tools and methods. It already seems to
possess the knowledge and results needed to indicate at least partial answers
to the questions at hand. The key to all answers lies in the very notion of
computing.

What is computing? What entities can compute? How do we recognize that 
something computes, that something is ‘computationally driven’? Where are the 
limits of computing? 

Of course, computers can compute. Can people compute? Can the brain compute?
There seems to be a tremendous difference between what brains and computers
compute. Is this a difference in principle, caused by some computational
mechanism so far unknown to us? Or is it merely a matter of efficiency? Or
perhaps a matter of a different computational scenario? Or one of a different
computational task that we cannot specify? Can the brain compute, in some
sense, ‘more’ than computers? The brain can be conscious; it can think. Can a
computer be conscious as well? Can it have emotions? Can it think? Is there a
continuum of minds? What does the respective hierarchy look like? Is thinking
an algorithmic process? Can we explain the emergence of mind, of consciousness,
of thinking, of language acquisition and generation in algorithmic terms? Can
we understand understanding?

To the best of our current knowledge, the answer seems to be positive. A
necessary ingredient in the algorithmic modeling of the previously mentioned
cognitive tasks is learning via endless, and in a sense purposeful, interaction
with the environment. What kind of mysterious self-organizing, self-checking
learning algorithm lies behind the evolution of the mind? What is the
corresponding computational model? Is it a mere neural net? A robotic body
equipped with senses and driven by a neural net? Should we turn to quantum
phenomena when looking for an answer? What about biocomputers? Will they
replicate themselves? Will they be alive then, in some sense? Will they be
conscious? To what extent?

On a larger scale, one can ask: can nature compute? Does nature compute? 
Is evolution of life also a computational process? Where is its origin? What are 
the underlying algorithmic principles? Where are its computational limitations? 
Does the unpredictable, non-algorithmic nature of interaction among
evolutionary systems lead to surpassing the Church-Turing barrier? Again: what
are the right computational models to capture the essence of evolution? Are
genetic algorithms the answer? Is there a single computational paradigm behind
all that? Is the Internet an evolutionary system? Can we ‘program’ it in a way
that will give rise to some artificial intelligence in it? Can an autonomous
(software) agent emerge in the respective virtual environment and exist in it?
Will such a form of virtual life, or a similar one, keep developing ad
infinitum? Will the intelligence of such an agent also grow beyond any limits?

Finally, on a still larger scale: the case of the Universe. Can we computa- 
tionally model the genesis of the Universe? Out of what initial information? 
Are all these wonderful machines such as cellular automata, genetic algorithms, 
quantum computers, neural networks, biocomputers, internets, etc., indeed the 
right means to model what we are after? Are the known computational resour- 
ces such as randomness, non-uniformity, fuzziness, quantum choice, interaction, 
really needed and/or sufficient to explain all phenomena of information exchange, 
forming and transformation? How faithful should our modeling be in order to 
capture all the necessary details? Where will the border be between the
simulating system and the system to be simulated? Will life then emerge in our
virtual universe? Will there eventually be some intelligence? And will it ask
the same questions as we did at the beginning of our essay? Would its principal
philosophical question be: “What came first: information, or the Universe?”,
or: “Is there a computational theory of everything?”

Today, on the doorstep of the third millennium, the issues mentioned above
may sound as fantastic to us as the questions from the beginning of this paper
would have sounded to men of science some 50 years ago. Or perhaps not?

Abstract. In a number of recent studies the question has arisen whether
the familiar Church-Turing thesis is still adequate to capture the powers
and limitations of modern computational systems. In this presentation
we review two developments that may lead to an extension of the classical
Turing machine paradigm: interactiveness, and global computing.



Computers will soon be part of all objects around us. This will enable forms of
algorithmic intelligence and communication of unprecedented versatility. The
question arises whether the traditional notions of computability theory are
still adequate to capture the powers and limitations of the new systems that
are arising. In particular, some developments seem to challenge the view that
computational systems are necessarily recursive. The present understanding of
computation may transcend what is normally expressed by the Church-Turing
thesis (see e.g. g] and 12]). In this presentation we will explore some
possible ingredients for extending the traditional computational models.

1 Interactive Computing 

Among the powerful computational notions in the large and in the small is the
notion of ‘interaction’. The notion typically applies to systems that consist of 
many components that compute and communicate with each other and with 
some external ‘environment’. 

The purpose of an interactive system is usually not to compute some final 
result but to react to or interact with the environment in which the system 
is placed and to maintain a well-defined action-reaction behavior. Interactive 
systems are always operating and thus may be seen as machines on infinite 
strings, but differ in the sense that their inputs are not specified and may depend 
on intermediate outputs and external sources. 

In the late nineteen seventies and early nineteen eighties, interactive (or: 
reactive) systems received much attention in the theory of concurrent processes 
(see e.g. m, 0). Wegner [21,22] recently called for a more computational view
of reactive systems, claiming that they have a richer behavior than ‘algorithms’
as we know them. He writes ([22], p. 318):

“The intuition that computing corresponds to formal computability by 
Turing machines . . . breaks down when the notion of what is computa- 
ble is broadened to include interaction. Though Church’s thesis is valid 
in the narrow sense that Turing machines express the behavior of algo- 
rithms, the broader assertion that algorithms precisely capture what can 
be computed is invalid. ” 

The study of ‘machines’ working on infinite input streams (ω-words) is by no
means new and has a sizeable literature, with the first studies dating back to 
the nineteen sixties and seventies (cf. Thomas |T^, Staiger d). Nevertheless
the notion of interactiveness adds a new perspective to it. For a discussion of 
Wegner’s claims from the viewpoint of computability theory, see Prasse and 
Rittgen mg). Formalizations of the theory of interaction are studied in Wegner 
and Goldin P3ESI, and Goldin jS|. 

In mi we look at the computational implications of interaction. We give a
simple model of interactive computing, consisting of one component C and an 
environment E interacting using single streams of input and output signals. The 
notion of ‘component’ that we use is very similar to that of a ‘site machine’ 
(see below). We identify a special condition, referred to as the interactiveness 
condition, which we impose throughout. Loosely speaking, the condition states 
that C is guaranteed to give some meaningful output within finite time any time 
after receiving a meaningful input from E and vice versa. 

In the model we prove a number of general results on the interactive computing
behaviour that a component C can exhibit, assuming that E can behave
arbitrarily and unpredictably. We always assume that C is a program with
unbounded memory, whose contents build up over time and are never erased
(unless the component explicitly does so). To understand the operation of
interactive components, one may define an ω-string x to be interactively
recognizable if C outputs only 1's (possibly interspersed with silent periods
of finite duration) when it is fed x during its dialogue with E.

Example 1. The set $J = \{\alpha \in \{0,1\}^{\omega} \mid \alpha \text{ contains finitely many 1's}\}$
can be seen not to be interactively recognizable. Suppose there were an
interactive component C that recognized J. Let E input 1's. By interactiveness,
C must generate a non-empty signal $\sigma$ at some moment in time. E can now
fool C as follows. If $\sigma = 0$, then let E switch to inputting 0's from
this moment onward: the resulting input belongs to J, but C does not respond
with all 1's. If $\sigma = 1$, then let E continue to input 1's. Possibly C
outputs a few more 1's, but there must come a moment at which it outputs a 0.
If it did not, then C would recognize the string $1^{\omega} \notin J$. As soon
as C outputs a 0, let E switch to inputting 0's from this moment onward: the
resulting input still belongs to J, but C does not recognize it properly.
Contradiction.

Viewing components as interactive transducers of the signals that they receive 
from their environment one can show the following analogue to similar results 
from classical automata theory, using suitable definitions. The result is quite
involved and heavily relies on the interactiveness condition. 

Theorem 1. A set $J \subseteq \{0,1\}^{\omega}$ is interactively recognizable if and only if it can
be interactively generated. 

In PS| we also study a notion of interactively computable functions. We prove 
that interactively computable functions are limit-continuous, using a suitable 
extension of the notion of continuity known from the semantics of computa- 
ble functions. We also prove an interesting inversion theorem which states that 
interactively computable 1-1 functions have interactively computable inverses. 

2 Global Computing 

Among the new computational structures in the large, the Internet is beginning
to play a very prominent role. Cardelli [2] was among the first to realize
the potential of the Internet as a programmable resource, suggesting that
information structures may be realized on the Internet that can operate as
‘global computers’. He foresees a multitude of (different) global computers
operating on the Web, interacting with users and geographically distributed
resources. The present developments in ‘grid computing’ (Foster and Kesselman 0)
attempt to make this operational for the purposes of high performance computing.

Among the many questions raised by Cardelli [2] is the question of what models
of computation are appropriate for global computing. The Internet is an
infrastructure for highly distributed forms of information exchange and
computation, allowing large and varying numbers of autonomous software entities
to interact, compute and evolve under unpredictable circumstances. It is an
environment in which the programs may depend on the data they have to operate
on, the inputs are not given in advance, and the computations may differ
depending on the influence of unpredictable sources. This gives a computational
power more akin to that of various ‘non-uniform’ computational models (cf. Q),
implying a power beyond that of ordinary Turing machines.

Elsewhere we describe a model of global computing in two stages. First ‘site
machines’ are defined, by augmenting Turing machines with a communication 
facility. Next we define global Turing machines (or ‘internet machines’) as finite 
but time-varying sets of communicating site machines that can compute ad in- 
finitum, modeling that a global computer may be evolving over time without 
limit. Under mild assumptions, the following theorem can be proved. 

Theorem 2. For every global Turing machine $\mathcal{M}$ there exists a Turing
machine $T$ with advice that sequentially realizes the same computations as
$\mathcal{M}$ does, and vice versa.

For an indication of the non-conventional power of Turing machines with
advice we refer to Schöning Cl and Balcázar et al. Q. In c we also make a
first step towards the development of a complexity theory for ‘global space’ and
‘global time’.
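
To indicate where the extra power of advice comes from, the following Python
sketch models a machine that, besides its input x, receives an advice string
depending only on the length of x. The function names and the small advice
table are hypothetical and serve only as an illustration; the sketch is not the
construction used for Theorem 2.

```python
from typing import Callable, Dict

def decide_with_advice(x: str, advice: Callable[[int], str]) -> bool:
    """Accept the unary word x = 1^n iff the advice for length n starts with '1'."""
    if set(x) - {"1"}:
        return False                  # only unary words are candidates
    # The advice depends on the input length alone, never on the input itself.
    return advice(len(x)).startswith("1")

# Hypothetical advice table for a few input lengths (illustration only).
TABLE: Dict[int, str] = {0: "0", 1: "1", 2: "0", 3: "1"}

def toy_advice(n: int) -> str:
    return TABLE.get(n, "0")
```

The machine itself is trivial; all the information sits in the advice
function. Since that function need not be computable, a table encoding a
non-recursive set of lengths would make the same program decide a
non-recursive unary language, which is the non-uniform effect referred to
above.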

3 Conclusion

Several developments in the theory of computation have challenged the kinds of 
computational models we are currently using (see |2], |1 dj). The given examples 
of interactive and global computing indicate that the classical Turing machine 
paradigm should be revised (extended) in order to capture the forms of com- 
putation that one observes in the systems and networks in modern information 
technology. 

Abstract. It may seem that there are almost as many varieties of
programming language semantics as there are of programming languages.
This brief summary surveys the main varieties of semantics, and
considers which of them may be the most appropriate.



1 Introduction 

In 1972, Christopher Strachey wrote the paper The Varieties of Programming 
Language j77]. (The title of the paper was by analogy with that of a book by
William James: The Varieties of Religious Experience pm.) Strachey, in collabo- 
ration with Dana Scott, had developed the semantic description framework that 
became known as denotational semantics, and in the paper he showed how the 
relationship between the domains of so-called denotable, expressible, and stor- 
able values used in denotational semantics could reveal major characteristics of 
the described language. 

In fact Strachey had an almost religious conviction of the superiority of the
denotational approach to semantics over the operational and axiomatic
approaches. He found it most natural to represent programming constructs as pure
mathematical functions, and Scott-domains provided the foundations for his use 
of Curry’s “paradoxical combinator” for obtaining fixed points of higher-order 
functions. Sadly, Strachey’s work on the development of denotational semantics 
was cut short by his untimely death in 1975. His insight and ideas have nevert- 
heless had a profound and lasting impact, as witnessed this year by the special 
issue of HOSC ^ published to commemorate the 25th anniversary of his de- 
ath. The present author is particularly indebted to Strachey for the inspiration 
and guidance he provided during doctoral studies at the Programming Research 
Group in Oxford. 

However, the proliferation of different frameworks for semantic description of 
programming languages over the past three decades makes one wonder whether 
Strachey’s satisfaction with denotational semantics was fully justified. Let us 
(here, very briefly) survey the main varieties of semantics, and then consider 
which may be the most appropriate for various uses. 

2 Operational Semantics

The main idea of the operational approach to semantics is to model computations 
step-by-step. Operational frameworks include: 

— translation to code for abstract machines such as the SECD-machine H2] or 
that of VDL PI] ; 

— translation to the λ-calculus H2|, which is then evaluated to normal form;

— application of abstract interpreters, which operate on a direct representation 
of the structure of the program cni; 

— application of sets of term rewriting rules, using evaluation contexts ^ to 
control the order of sub-computations; and 

— syntax-directed inference rules for transitions between configurations UCH 
E2E1, where “small-step” transitions model single steps of information pro- 
cessing, whereas “big-step” transitions model entire (terminating) computa- 
tions, corresponding to the transitive closure of a small-step relation. 
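
The relation between small-step and big-step transitions in the last item can
be sketched in Python for a toy language of integer additions. The expression
encoding and the function names are assumptions made for this illustration;
they do not come from any of the cited frameworks.

```python
from typing import Optional, Union

Expr = Union[int, tuple]        # an integer literal or ("add", e1, e2)

def small_step(e: Expr) -> Optional[Expr]:
    """One transition e -> e', or None if e is already a value."""
    if isinstance(e, int):
        return None
    _, left, right = e
    if not isinstance(left, int):
        return ("add", small_step(left), right)     # reduce the left operand first
    if not isinstance(right, int):
        return ("add", left, small_step(right))     # then the right operand
    return left + right                             # both operands are values

def big_step(e: Expr) -> int:
    """Relate an expression directly to its final value."""
    if isinstance(e, int):
        return e
    _, left, right = e
    return big_step(left) + big_step(right)

def run_small_steps(e: Expr) -> int:
    """Iterate small steps until a value is reached (closure of small_step)."""
    nxt = small_step(e)
    while nxt is not None:
        e, nxt = nxt, small_step(nxt)
    return e
```

For every terminating expression of this toy language, `run_small_steps` and
`big_step` agree, while only `small_step` exposes the intermediate
configurations.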



3 Denotational Semantics 

Here semantic functions, generally defined inductively by semantic equations, 
map programs to their denotations, which represent computations as functions 
taking initial states directly to final states without revealing intermediate steps 

PHI2E!. 
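
A minimal Python sketch of such a semantic function for a toy command language
is given below. It is an illustration under simplifying assumptions: commands
are encoded as tuples, expressions and tests are supplied directly as Python
functions on states rather than as syntax, and loops are omitted because they
would require a fixed-point construction. All names are hypothetical.

```python
from typing import Callable, Dict

State = Dict[str, int]
Denotation = Callable[[State], State]

def sem(cmd: tuple) -> Denotation:
    """Semantic function: maps a command to a function from initial to final states."""
    tag = cmd[0]
    if tag == "skip":
        return lambda s: s
    if tag == "assign":                 # ("assign", x, e) with e: State -> int
        _, x, e = cmd
        return lambda s: {**s, x: e(s)}
    if tag == "seq":                    # ("seq", c1, c2)
        d1, d2 = sem(cmd[1]), sem(cmd[2])
        return lambda s: d2(d1(s))      # denotation of c1; c2 is a composition
    if tag == "if":                     # ("if", b, c1, c2) with b: State -> bool
        _, b, c1, c2 = cmd
        d1, d2 = sem(c1), sem(c2)
        return lambda s: d1(s) if b(s) else d2(s)
    raise ValueError(f"unknown command: {tag}")
```

Each clause plays the part of one semantic equation: the denotation of a
compound command is built only from the denotations of its parts, and applying
the result to an initial state yields the final state without exposing
intermediate steps.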

Variations on this theme include: 

— initial algebra semantics [^, where (abstract) syntax is an initial algebra in a 
class of algebras, and where semantic equations are replaced by the definition 
of a target algebra interpreting the syntactic constructs; 

— monadic semantics HM, where for each type of value, there is the type of 
computations of such values, and where the values and computations form 
a monad (equipped with further operations); and 

— translation between meta-languages UBI, where the semantic functions map 
programs to meta-language terms that may then be interpreted as functions 
in monads. 

4 Axiomatic Semantics 

The main idea here is to exploit assertions about programs: 

— Hoare Logic uses inference rules to relate assertions about the values of 
variables before and after program phrases; 

— weakest-precondition semantics 0 is essentially denotational, with denota- 
tions being functions from assertions about values of variables after compu- 
tations to assertions about the values beforehand; and 

— laws of programming I2ni characterize computation by assertions about pro- 
gram equivalence. 
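
The weakest-precondition view in the second item can be sketched in Python by
treating assertions as predicates on states. The encoding below is an
assumption made for this illustration: commands are tuples, expressions are
Python functions on states, and only skip, assignment and sequencing are
covered.

```python
from typing import Callable, Dict

State = Dict[str, int]
Assertion = Callable[[State], bool]

def wp(cmd: tuple, post: Assertion) -> Assertion:
    """Map an assertion about the final state to one about the initial state."""
    tag = cmd[0]
    if tag == "skip":
        return post                         # wp(skip, Q) = Q
    if tag == "assign":                     # wp(x := e, Q) = Q[e/x]
        _, x, e = cmd
        return lambda s: post({**s, x: e(s)})
    if tag == "seq":                        # wp(c1; c2, Q) = wp(c1, wp(c2, Q))
        return wp(cmd[1], wp(cmd[2], post))
    raise ValueError(f"unknown command: {tag}")
```

For the program `x := x + 1; y := x * 2` and the postcondition `y == 10`, the
computed precondition holds exactly for initial states with `x == 4`, which
shows the denotation working backwards from assertions after the computation to
assertions beforehand.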

5 Hybrid Approaches

The final varieties of semantics considered here combine features of operational, 
denotational, and axiomatic semantics: 

— action semantics f20l2,'l,Sn| is similar to monadic denotational semantics, 
but denotations are actions rather than functions, and the notation used for 
composing computations is interpreted operationally; 

— modular monadic action semantics m provides the action notation used in 
action semantics with a monadic denotational semantics; 

— type-theoretic interpretation 0 uses evaluation contexts and reduction rules 
to define an internal language for composing computations; 

— specification frameworks such as OBJ3 and Rewriting Logic can be 
used to define interpreters and transition relations; and 

— mathematical operational semantics |‘2S] allows definition by rules that pro- 
vide both an operational and a denotational semantics. 



6 Conclusion 

To give a semantic description of a full high-level programming language in any framework
requires a major effort. Some of the varieties of semantics referenced above reduce 
the effort required by allowing re-use of parts of descriptions of other languages; 
these include modular SOS, monadic denotational semantics, action semantics, 
and modular monadic action semantics. Other important pragmatic aspects of 
semantic descriptions concern the ease with which they can be used for setting 
standards for implementations (or merely for communication of the designers’ 
intentions to implementors), program verification, and compiler generation. 

It appears that at present, no single semantic framework is ideal with regard 
to all pragmatic aspects. Sometimes, it is suggested that one should therefore 
aim to provide two or more complementary semantic descriptions of each language,
in different styles — and prove that they are equivalent — but the extra effort 
involved would surely be quite prohibitive when describing the semantics of 
larger programming languages. 

In the opinion of this author, it is generally advantageous to translate a 
(large and complex) programming language into a (smaller and simpler) nota- 
tion for semantic entities, as in denotational semantics, and in the hybrid action 
semantics and type-theoretic interpretation frameworks. However, when the di- 
stance between the programming language and the semantic notation is large, 
the translation itself may be uncomfortably complex; when it is small, the se- 
mantic notation may be almost as difficult to define or reason about as the 
programming language. The matter of how best to define the semantic notation 
itself appears to be less crucial, especially if the same semantic notation can be 
exploited in the description of many different programming languages. 